Writing ~20k documents to MongoDB every 5 minutes - upsert/bulk processing, optimizations?

hristo6004 4 months ago

Hi,



I have a project where I'm reading external APIs and writing to the database every 5 minutes. The data is 99% fresh. I have a Redis layer, but it doesn't help much, since the interval is 5 minutes and the data changes in the meantime.



Now my logic is basically the following:


Get ~20k objects from API → Loop through each → Check Redis if it exists/needs to be updated or created → Update DB



So this basically ends up as a loop that creates 5% and updates 95% of those 20k documents, and it takes about 2 minutes on a server with 32GB RAM, a 3.3 GHz 16-core CPU and a 2TB SSD.
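One way to picture the "check Redis if it exists/needs to be updated or created" step is as a planning pass that runs before any write. The sketch below is only an illustration under assumptions: `ApiItem`, `externalId`, `planWrites` and the `Map` standing in for Redis are hypothetical names, not part of the poster's code or of Payload.

```typescript
// Hypothetical shape of one object returned by the external API.
interface ApiItem {
  externalId: string;
  [field: string]: unknown;
}

interface WritePlan {
  toCreate: ApiItem[];
  toUpdate: ApiItem[];
  unchanged: number;
}

// Cheap change fingerprint. JSON.stringify is sensitive to key order, so this
// assumes the API returns fields in a stable order.
const fingerprint = (item: ApiItem): string => JSON.stringify(item);

// `seen` stands in for the Redis layer: externalId -> fingerprint of the
// version that was last written. It is not mutated here; the cache should
// only be refreshed after the database write has succeeded.
function planWrites(items: ApiItem[], seen: Map<string, string>): WritePlan {
  const plan: WritePlan = { toCreate: [], toUpdate: [], unchanged: 0 };
  for (const item of items) {
    const previous = seen.get(item.externalId);
    if (previous === undefined) {
      plan.toCreate.push(item); // never written before
    } else if (previous !== fingerprint(item)) {
      plan.toUpdate.push(item); // written before, but the content changed
    } else {
      plan.unchanged += 1; // identical to the last write, nothing to do
    }
  }
  return plan;
}
```

Splitting the batch this way keeps the decision logic separate from the writes, so the create and update lists can each be handed to whichever write path is chosen.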



My question is, should I write directly to the database using payload.db? Are there any upsert/bulk operations with Payload (I couldn't find any)? Any optimizations that I could do? I think if I write directly to the database it will be faster, but then I'm losing the safety and validation of Payload. Am I thinking about this right? Thanks!



Wow, I refactored my code to use MongoDB directly (or through payload.db.). Before, importing 10k documents took up to a minute; now it's done in less than 4 seconds. 🤯

  • rubixvi 4 months ago

    Of course it's faster, because it no longer needs to run safety checks or any hooks. Using .db. is basically just using Mongo's commands.

  • alessiogr 4 months ago

    Especially for something like payload.update vs payload.db.updateMany this will be very noticeable, because payload.update has to fetch and update documents one by one to run hooks for every single document affected in the update.

  • hristo6004 4 months ago

    I was thinking this is the reason, and I'm guessing that by design Payload can't support bulk operations, right? Because it has to do all of the validation, access and hooks for each document?

  • alessiogr 4 months ago

    Yep, there are ways we could optimize it further in the future, e.g. we could try to batch some updates where the data is identical, but it will never be close to using payload.db.*

  • sieudino. 4 months ago

    The Payload UI looks good, is that a custom one you built? Can I have the source? 😭

  • hristo6004 4 months ago

    Hi, yeah, it's a custom one. I can't provide the source for now, because it's part of a client project and I'd have to prepare a clean copy of it with the UI only. But all in all, it's not complicated to do. Basically three things:



    1) Custom CSS


    2) Used admin.custom for each collection/global, where I can insert custom data and then read it in a custom Sidebar component that sorts the globals and collections and adds icons


    3) Used a custom component for the dashboard, which is in React, so it can be as custom as you want it



    For example:



    admin: {
      custom: {
        order: 2,
        iconPath: '/admin/icons/settings.png',
      },
      group: 'Importer',
    },
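The Sidebar component itself isn't shared in the thread, so here is only a rough sketch of the sorting part it describes: reading `admin.custom.order` and `admin.custom.iconPath` from each collection/global and building grouped navigation entries. `NavEntity`, `NavGroup`, `buildNav` and the "Ungrouped" fallback label are hypothetical names; the React rendering and the way the configs reach the component are left out.

```typescript
// Hypothetical minimal view of a collection/global config:
// only the fields this sketch reads.
interface NavEntity {
  slug: string;
  admin?: {
    group?: string;
    custom?: { order?: number; iconPath?: string };
  };
}

interface NavItem {
  slug: string;
  iconPath?: string;
}

interface NavGroup {
  label: string;
  items: NavItem[];
}

// Entities without an explicit order sort after all ordered ones.
const orderOf = (entity: NavEntity): number =>
  entity.admin?.custom?.order ?? Number.MAX_SAFE_INTEGER;

function buildNav(entities: NavEntity[]): NavGroup[] {
  // Copy before sorting so the caller's array is left untouched.
  const sorted = entities.slice().sort((a, b) => orderOf(a) - orderOf(b));
  const groups: NavGroup[] = [];
  for (const entity of sorted) {
    const label = entity.admin?.group ?? "Ungrouped";
    let group = groups.find((g) => g.label === label);
    if (!group) {
      group = { label, items: [] };
      groups.push(group);
    }
    group.items.push({ slug: entity.slug, iconPath: entity.admin?.custom?.iconPath });
  }
  return groups;
}
```

A sidebar component could then map over the returned groups and render each item's icon next to its link.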