Duplicate form submissions and how to handle them in Django

Let's have a look at duplicate form submissions, understand the problem and fix it with a nice server-side solution. It's a common problem that almost always needs to be addressed - whether you are building HTML-over-the-wire or a backend/frontend architecture, you will have the same issues:

  • User clicks the same button twice.
  • User navigates back and re-submits a form that we already processed.
  • User reloads the response page and resubmits the same POST/GET form data.

Typically, this is because of some disturbance in the network or your own server having hiccups that prompt the user to retry. The duplicate form submission can then trigger:

  • Duplicate database objects
  • Duplicate user notifications
  • Duplicate anything_really
  • False-positive UI validation errors because of uniqueness constraints (first submission was valid!)
  • Further unnecessary server-side errors in case of unhandled uniqueness errors

Likely, the worst part of this problem is already fixed with the "POST-redirect-GET" pattern, where the POST target responds with an HTTP 302 and sends the user to a "thank you" page. Since it's built into Django's generic form handling views, you've probably already done that part.

Desired properties

You can skip this if you don't care about what the solution aims to provide, but I'd love to hear back about anything that's been missed here!

  • MUST process duplicate form submissions exactly once.
  • MUST allow the user to submit a duplicate form and still complete their journey successfully.
  • MUST process non-duplicate form submissions (defining what "duplicate" means is the solution to this requirement).
  • MUST allow duplicate form data from different user sessions (same as above, but a good illustration).
  • MUST function across multiple application workers / containers / threads.
  • SHOULD give user a meaningful response when submitting duplicate form data.
  • SHOULD be generic and easy to reuse.

In terms of multiple application workers and containers, there are race conditions for which I haven't fully solved the above. UPDATE: See final discussion.

Client-side disabling the submit button onsubmit

Here is a really simple snippet that resembles a common solution. You can add it to keep users from accidentally or deliberately clicking the submit button multiple times:

// When a form is submitted, disable all its submit buttons
$(document).ready(function() {
    $('form').submit( function(event) {
        $(this).find(':submit').prop('disabled', true);
    });
});

But STOP: What could possibly go wrong?

  • The user never gets a second chance to submit the form, in case they had connection issues. So we'd need a timeout or some error handling to restore it to enabled state.
  • If the user does submit but receives an error in the browser (network error, Wi-Fi login etc) and they press "Back" in the browser, the button is still disabled!
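  • If you use the name and value attributes of the submit button to know which button the user clicked, that information is lost: the button is disabled before the form data is collected, and disabled controls are not included in the submission.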

Because of all these issues, my opinion is that it's infeasible to solve this with JavaScript, so let's just make the server understand if the form was submitted twice and give a meaningful response. It's a bit like over-investing in client-side validation - sure, it's nice (if implemented correctly), but you need to do it on the server anyway.

Database uniqueness constraints

Another preventive measure can be to add uniqueness constraints to your database. If you can do this, you should. And then you should handle these errors gracefully.

Here's some code that we might use to handle duplicate form submissions:

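from django.contrib import messages
from django.db import IntegrityError
from django.shortcuts import redirect
from django.views.generic import CreateView


class MyCreate(CreateView):

    def form_valid(self, form):
        # Try to save the form, assume that IntegrityErrors from the database
        # are because we already saved the same form.
        try:
            super().form_valid(form)
        except IntegrityError:
            # messages.info() needs the request as its first argument
            messages.info(self.request, "This form was probably already saved.")
        return redirect("thank-you")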

Notice, though, that we don't actually know whether the save failed because it was a double submission. We might also have form validation that kicks in before we even try saving the object. In any case, by handling the constraint error at this level we risk informing the user that their form was saved when it actually wasn't! We need to know that it was saved.

We can also do this while validating the form. In this case, we block the form and the user receives a validation error in the UI. But then the validation error might shadow the initial successful submission of the form! Remember that we are dealing with the problem of double-submissions.

So what can you do?

Let's get started with an approach for rejecting form POST data that we've already processed (it's not quite perfect though):

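import hashlib

from django.contrib import messages
from django.http import HttpResponseRedirect
from django.shortcuts import get_object_or_404
from django.views.generic import CreateView

# MyModel is your own model; import it from your app's models module.


class MyCreateView(CreateView):

    def form_valid(self, form):

        # Unique key to store session for this view
        session_form_hash = f"form-submission+{self.__class__.__name__}"
        # Unique key for storing the pk when the object is created
        self.session_last_saved_pk = session_form_hash + "-pk"

        # Calculate hash of the POST data
        excluded = {
            "csrfmiddlewaretoken",
        }
        post_items = sorted(
            (k, v) for k, v in self.request.POST.items() if k not in excluded
        )
        # hashlib instead of built-in hash(): hash() salts strings per process,
        # so its result cannot be compared across application workers
        this_post_hash = hashlib.sha256(repr(post_items).encode("utf-8")).hexdigest()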

        # Previously calculated hash
        previous_post_hash = self.request.session.get(session_form_hash)

        # Form has already been processed!
        if this_post_hash == previous_post_hash:
            # Fetch the previous object and return a helpful message to the user
            self.object = get_object_or_404(MyModel, pk=self.request.session[self.session_last_saved_pk])
            messages.warning(
                self.request, "This form was submitted several times. It has been processed just once."
            )
            return HttpResponseRedirect(self.get_success_url())
        else:
            response = super().form_valid(form)
            self.request.session[session_form_hash] = this_post_hash
            # self.object is defined when using ModelFormMixin
            self.request.session[self.session_last_saved_pk] = self.object.pk
            return response

We're almost there:

Before the form is saved, we start by hashing its POST data (excluding the csrfmiddlewaretoken, which can change). We sort the items before hashing, so the result does not depend on the order of the fields.
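One caveat on hashing across workers: Python's built-in hash() salts str values per interpreter process unless PYTHONHASHSEED is fixed, so the same POST data can produce different values in two workers. Here is a minimal sketch of a process-independent digest. The helper name stable_post_digest is hypothetical (it is not part of Django), and it assumes it is given (key, value) pairs such as those from request.POST.items():

```python
import hashlib


# Hypothetical helper, not part of Django.
def stable_post_digest(items, excluded=("csrfmiddlewaretoken",)):
    """Return a hex digest that is identical across processes for equal data."""
    # Sort the pairs so that field order does not influence the digest
    filtered = sorted((k, v) for k, v in items if k not in excluded)
    return hashlib.sha256(repr(filtered).encode("utf-8")).hexdigest()
```

In a view this would be called as stable_post_digest(self.request.POST.items()). The result is a plain string, so it can be stored in the session as-is.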

We store the hash with a session key that's unique for this particular view. This means that we can detect duplicates that are unique for the current user's session and we can also share it across application workers/threads/containers.

Finally, we check if any existing form POST data for this particular form and user was hashed. If not, we proceed to saving the form and marking it saved. If it was already saved, we inform the user that the form was saved on the thank-you page and add a message that we registered duplicate submissions.

But we only do this in the form_valid() method. If that's good enough, you can stop here. But if you have constraints and validation rules that make the second form submission invalid, the duplicate never reaches form_valid() and the final submission comes back as invalid. The user won't know that the first form was correctly saved.

Icing on the cake 🍰️

We're gonna continue to hash and save our new object's PK in the form_valid() method, but we are going to move the duplicate check to the post() method. That way, we can avoid issues with invalid form data when it's due to uniqueness constraints kicking in. In fact, we just skip processing and validating the form altogether when we've already seen it.

Win-win! 💯️

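import hashlib

from django.contrib import messages
from django.http import HttpResponseRedirect
from django.shortcuts import get_object_or_404
from django.views.generic import CreateView

# MyModel is your own model; import it from your app's models module.


class MyCreateView(CreateView):

    def post(self, request, *args, **kwargs):

        # Unique key to store session for this view
        self.session_form_hash = f"form-submission+{self.__class__.__name__}"
        # Unique key for storing the pk when the object is created
        self.session_last_saved_pk = self.session_form_hash + "-pk"

        # Calculate hash of the POST data
        excluded = {
            "csrfmiddlewaretoken",
        }
        post_items = sorted(
            (k, v) for k, v in self.request.POST.items() if k not in excluded
        )
        # hashlib instead of built-in hash(): hash() salts strings per process,
        # so its result cannot be compared across application workers
        self.post_hash = hashlib.sha256(repr(post_items).encode("utf-8")).hexdigest()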

        # Previously calculated hash
        previous_post_hash = self.request.session.get(self.session_form_hash)

        # Form has already been processed!
        if self.post_hash == previous_post_hash:
            # Fetch the previous object and return a helpful message to the user
            self.object = get_object_or_404(
                MyModel,
                pk=self.request.session[self.session_last_saved_pk]
            )
            messages.warning(
                self.request, "This form was submitted several times. It has been processed just once."
            )
            return HttpResponseRedirect(self.get_success_url())

        return super().post(request, *args, **kwargs)

    def form_valid(self, form):
        response = super().form_valid(form)
        # Store the hash only after the form was validated and saved
        self.request.session[self.session_form_hash] = self.post_hash
        # self.object is defined when using ModelFormMixin
        self.request.session[self.session_last_saved_pk] = self.object.pk
        return response

By storing the POST hash only after validating and saving the form, we never wrongly block a first submission. And because we check for the stored hash before validating the next submission, we catch duplicates before uniqueness constraints can reject them.

What's next?

Because I think that any project would need to adjust this to their own definition of what a "double submission" is, I'm going to leave this here for now. But we might be able to brew together a nice view decorator.
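As a starting point for something reusable, here is a rough sketch of a mixin rather than a decorator. It only repackages the post()/form_valid() approach from above. The class name DuplicateSubmissionMixin and its method names are hypothetical, and it assumes the view defines model or queryset so that get_queryset() works:

```python
import hashlib


class DuplicateSubmissionMixin:
    """Hypothetical mixin: list it before CreateView in the view's base classes."""

    # POST keys ignored when comparing two submissions
    duplicate_excluded_fields = frozenset({"csrfmiddlewaretoken"})

    def get_duplicate_session_key(self):
        # One slot per view class and user session
        return f"form-submission+{self.__class__.__name__}"

    def get_post_digest(self):
        items = sorted(
            (k, v)
            for k, v in self.request.POST.items()
            if k not in self.duplicate_excluded_fields
        )
        return hashlib.sha256(repr(items).encode("utf-8")).hexdigest()

    def post(self, request, *args, **kwargs):
        key = self.get_duplicate_session_key()
        if request.session.get(key) == self.get_post_digest():
            # Already processed: skip validation and saving entirely
            return self.duplicate_submission_response()
        return super().post(request, *args, **kwargs)

    def form_valid(self, form):
        response = super().form_valid(form)
        # Remember the submission only after it was validated and saved
        key = self.get_duplicate_session_key()
        self.request.session[key] = self.get_post_digest()
        self.request.session[key + "-pk"] = self.object.pk
        return response

    def duplicate_submission_response(self):
        # Imported lazily so the sketch can be read and tested without Django
        from django.contrib import messages
        from django.http import HttpResponseRedirect
        from django.shortcuts import get_object_or_404

        pk = self.request.session[self.get_duplicate_session_key() + "-pk"]
        # Assumes the view defines `model` or `queryset`
        self.object = get_object_or_404(self.get_queryset(), pk=pk)
        messages.warning(
            self.request,
            "This form was submitted several times. It has been processed just once.",
        )
        return HttpResponseRedirect(self.get_success_url())
```

A view would then be declared as class MyCreateView(DuplicateSubmissionMixin, CreateView). Projects with a different definition of "duplicate" could override get_post_digest() or duplicate_excluded_fields.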

UPDATE: There are a couple of edge-cases with respect to high load and concurrency that you might consider:

  • Race-conditions: Processing the form in the database can take time, leaving a gap for double submissions to enter while another submission is being saved in the database by another worker/thread/container.
  • Session and cache invalidation: We use the session to store form hashes. If you have a LOT of sessions, consider that a session lives much longer than is needed to guard against duplicate form submissions. So you might instead store a cache item keyed on the current session ID, with a timeout of a few minutes.
  • In any case, remember to run clearsessions regularly.
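One way to approach the race condition and the timeout concern together is to "claim" a submission in the cache before processing it. This is a sketch under assumptions, not a tested production recipe. The function names are hypothetical, `cache` is expected to behave like Django's cache API (add() stores a value only if the key is absent and returns whether it did), and the backend is assumed to perform add() atomically, as Memcached and Redis do:

```python
def submission_key(session_key, view_name, digest):
    # session_key would come from request.session.session_key (assumption)
    return f"form-submission:{view_name}:{session_key}:{digest}"


def claim_submission(cache, key, timeout=300):
    """Return True for the first caller only, until the key expires."""
    # add() is a no-op returning False when the key already exists
    return cache.add(key, "claimed", timeout)


def release_submission(cache, key):
    """Give the claim back, e.g. when validation or saving failed."""
    cache.delete(key)
```

A view would call claim_submission() at the top of post(), treat a False result as a duplicate, and call release_submission() if the form turns out to be invalid or saving raises. Note that a duplicate arriving while the first submission is still being saved cannot be redirected to the saved object yet, so it needs its own response, for example a "your submission is being processed" page.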

I didn't have any large-scale or high load/concurrency use cases on my radar, but if you have any input, join the conversation in the Fediverse.