Uploading job list is extremely slow #115

Open
nclaidiere opened this issue Apr 15, 2022 · 14 comments
Labels
enhancement (New feature or request)

Comments

@nclaidiere
Collaborator

Uploading job list is extremely slow, about 3 mins for a 5kb csv file (I stopped counting when I had to upload a 10k trial list, I did it overnight but I think it was more than 1h).

@dschreij
Member

I haven't experienced this actually! For me this is quite fast, even for larger files.
@smathot what are your experiences with this?

@nclaidiere
Collaborator Author

OK, so one question is: does this depend on the number of participants? Does the upload create job lists, session data, etc. before assigning participants?

@smathot
Collaborator

smathot commented May 13, 2022

I have also never experienced this, but of course I've never worked with a database that's as populated as @nclaidiere's.

@smathot
Collaborator

smathot commented Jun 23, 2022

I am testing this now on the server of @nclaidiere and I can confirm that this is really slow, about 30 seconds for a table with 50 rows and 2 columns. (Whereas otherwise the server is fairly snappy.) I suspect that Study.processJobs() is the culprit.

@dschreij
Member

dschreij commented Sep 18, 2022

Did some research into this after I also experienced bad performance during an automated test run of a study. During that run, each query took about two seconds (according to the OpenSesame console), while for @smathot execution was in the millisecond domain (#92 (comment)).
My assumption is that this process is so slow because of queries taking a lot longer than usual, and there are a lot of queries in this function.
I found quite a few reports of MySQL vastly underperforming when running in Docker (on Windows), for instance:

There are also some proposed solutions for which varying degrees of success are reported:

  • Add :delegated to the mysql data path mapping in docker-compose (see the sketch after this list).
  • Map the Docker data folder to a physical location on the disk. It currently points to a volume container, which is the preferred method, but other than that there wouldn't be much difference in persistence between the two options.
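
A minimal sketch of what both options could look like in docker-compose (the volume and path names here are assumptions, not the project's actual configuration):

    volumes:
      # option 1: relax host/container file-sharing consistency
      # (only has an effect on Docker Desktop; ignored on plain Linux)
      - mysql-data:/var/lib/mysql:delegated
      # option 2: bind-mount to a physical folder on the host
      # instead of a named volume container
      # - ./mysql-data:/var/lib/mysql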

I'll see if I have time to play around with these soon and see if they have any effect. If none of these work, we can also try to:

  • Replace MySQL with PostgreSQL, another popular and widely used open-source database. I mainly work with it in my current projects, and it works really well in Docker and on Windows. Since this project uses an ORM that supports Postgres out of the box, it shouldn't be that much work to replace (but it may bring some surprises too). See the sketch after this list.
  • Install and run the Windows version of MySQL locally, making Docker unnecessary. This involves a few more steps when setting up and makes the system less portable, but it may remedy this and other performance and memory issues.
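
For the Postgres route, a rough docker-compose sketch could look like this (image tag, credentials, and volume name are placeholders; the application's ORM configuration would also need to be switched to its Postgres connection):

    postgres:
      image: 'postgres:15'
      restart: always
      environment:
        POSTGRES_DB: omm          # placeholder database name
        POSTGRES_USER: omm        # placeholder user
        POSTGRES_PASSWORD: secret # placeholder password
      ports:
        - 5432:5432
      volumes:
        - postgres-data:/var/lib/postgresql/data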

@smathot
Collaborator

smathot commented Sep 19, 2022

My assumption is that this process is so slow because of queries taking a lot longer than usual, and there are a lot of queries in this function.

That does sound likely. That being said, uploading is also slow on servers for which the other queries go very fast.

@kristian-lange
Collaborator

So, the culprit lies in Study.js, in the function processJobs.

If there are many job entries in the uploaded job file, the following loop makes an unholy number of database requests, which slows down the whole upload procedure:

    for (const [i, row] of Object.entries(jsonData)) {
      // one INSERT into `jobs` per row of the uploaded file
      const job = await this.jobs().create({
        position: parseInt(i) + 1
      }, trx)

      // plus pivot-table INSERTs into `job_variable` for every variable of that row
      await job.variables().attach(Object.keys(varTable), (record) => {
        record.value = row[varTable[record.variable_id]]
      }, trx)
    }

It has been difficult for me to improve speed in this loop due to:

  1. my inexperience with the technology used (Adonis/Lucid)
  2. Adonis/Lucid being used in an outdated version. The version according to package.json is 6.3, but the way it is used in OMM is 4.x (https://legacy.adonisjs.com/docs/4.0/lucid). As far as I can see, 4.x does not offer a real ORM, which makes it more difficult to use.
  3. a complex database schema around studies, jobs, variables, and job_variable.

So far I have cut the time roughly in half by simply wrapping the for loop with const trx = await Database.beginTransaction() and await trx.commit(), which should avoid committing the database requests one by one and instead commit them all at once at the end (see the sketch below).

Maybe this is already enough. If not, I will have to revisit this issue and probably try to refactor the database schema.
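
For clarity, a rough sketch of that change (not the exact committed code; it assumes the Adonis 4.x Database provider and the variables from the snippet above):

    const Database = use('Database')

    // open one transaction for the whole upload instead of
    // letting every row's queries auto-commit individually
    const trx = await Database.beginTransaction()
    try {
      for (const [i, row] of Object.entries(jsonData)) {
        const job = await this.jobs().create({
          position: parseInt(i) + 1
        }, trx)
        await job.variables().attach(Object.keys(varTable), (record) => {
          record.value = row[varTable[record.variable_id]]
        }, trx)
      }
      await trx.commit()   // everything is committed to the database at once
    } catch (err) {
      await trx.rollback() // keep jobs/job_variable untouched if anything fails
      throw err
    }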

@smathot
Collaborator

smathot commented Jul 9, 2024

Thanks for this!

Maybe this is already enough. If not, I will have to revisit this issue and probably try to refactor the database schema.

Why don't you try uploading a list of say 1000 rows? If this works within a reasonable time frame, then that's ok.

Some background: once we've refactored the database as described under #155, it will no longer be necessary to upload extremely long job lists of tens of thousands of rows. The reason for such long lists would be to implement repeating trials through this route, rather than by resetting job states, so as to avoid triggering database issues.

This means that our goal is now to make uploading of regular-sized lists fast enough so that it doesn't significantly impair usability. How long does 1000 rows take for you?

@kristian-lange
Collaborator

Why don't you try uploading a list of say 1000 rows? If this works within a reasonable time frame, then that's ok. ...
How long does 1000 rows take for you?

What would you consider a reasonable time frame here? And does the content of the job file matter? I currently use a simple job file with entries like:

seq,firstname
1,Charlie
2,Adele
3,Bill
4,Josie
5,Louisa

With this data it takes the following to upload a job file:

  • 1000 jobs: ~6s
  • 10,000 jobs: ~60s

But those times aren't much different from how it was before.

Some background: once we've refactored the database as described under #155, it will no longer be necessary to upload extremely long job lists of tens of thousands of rows. The reason for such long lists would be to implement repeating trials through this route, rather than by resetting job states, so as to avoid triggering database issues.

I'm actually working on #155 now and trying to think of a suitable database schema. I thought #155 was about refactoring the schema for the tables that store the resulting job data, namely the job_states table, whose data field of type JSON contains the result data. But the jobs that are uploaded are held in different tables, namely jobs, job_variable, and variables. I made a visualization of the database schema, mostly for my own understanding.

[Image: visualization of the database schema]

  • tables responsible for storing jobs: jobs, job_variable, and variables, linked to studies
  • tables responsible for storing job results: job_states and job_statusses, linked to participants

I assume that refactoring the job result tables would have no impact on the job tables, and therefore no impact on the upload times that this ticket is about. Am I right?

Maybe my confusion stems from the word 'trial', which does not exist in the database schema. Can you define 'trial' for me again? Is it equal to a job result?

Also, maybe we should continue the discussion in #155.

@smathot
Collaborator

smathot commented Jul 11, 2024

With this data it takes the following to upload a job file: 1000 jobs: ~6s; 10,000 jobs: ~60s. But those times aren't much different from how it was before.

Those times are reasonable, but don't correspond to what @nclaidiere experienced on their server. This may be because the performance of MySQL is very different in their environment, which is a Windows system running Docker. What kind of environment are you using to test this?

I assume that refactoring the job result tables would have no impact on the job tables, and therefore no impact on the upload times that this ticket is about. Am I right?

That is correct.

Maybe my confusion stems from the word 'trial', which does not exist in the database schema. Can you define 'trial' for me again? Is it equal to a job result?

Yes. In most experiments, a trial would equal a job, and the result of a trial would equal a job result. So you can treat them as synonymous for this purpose.

@kristian-lange
Collaborator

Those times are reasonable, but don't correspond to what @nclaidiere experienced on their server. This may be because the performance of MySQL is very different in their environment, which is a Windows system running Docker. What kind of environment are you using to test this?

I use my Linux laptop with 32 GB of memory and a good CPU. Maybe the MySQL container in the production setup has some constraints; with Docker one can limit the resources each container gets. Or the host as a whole doesn't have many resources. Do you know the numbers of the production system?

Another reason could be the simple job data I used, with just two columns. I did a new test, this time with 11 columns:

seq,firstname,lastname,age,street,city,state,zip,dollar,pick,date
1,Vincent,Reid,50,Ruwce Parkway,Ursonib,LA,86135,$1297.26,YELLOW,05/23/2024
2,Kenneth,Goodman,64,Vovmow Park,Bujoba,MO,74213,$842.37,BLUE,03/22/1937
3,Travis,Higgins,60,Foba Manor,Zomleaki,MT,43770,$9467.29,YELLOW,06/21/1932
4,Minerva,McCoy,56,Avenu Drive,Lodagab,MA,22653,$4011.00,GREEN,02/08/2016
5,Virginia,Howard,37,Jutebe Court,Idtiwmen,NV,44747,$9778.00,GREEN,01/22/1914
...

and it took longer:

  • 1000 jobs: 31s
  • 10,000 jobs: 104s

If Nicolas has even more complex job data it can take even longer.

Ah, now I re-read Nicolas' original statement: "Uploading job list is extremely slow, about 3 mins for a 5kb csv file (I stopped counting when I had to upload a 10k trial list, I did it overnight but I think it was more than 1h)." The job data files from today's test are 76 kB and 769 kB, so it is substantially faster on my system. I have four ideas for the cause:

  1. As you already stated, the production system might have some constraints. We need to know the system setup there: CPU, memory, disk (free and used).
  2. The MySQL container might have some limitations.
  3. Another thing that comes to my mind: MySQL has a binary log that is often activated by default in newer versions and that can eat up a substantial amount of resources. In most cases the binary log is never used because other backups are available, so it can be safely deactivated (see the sketch after this list).
  4. The database might slow down as more data is stored in it. This would hint at a problem with the database schema. A well-known cause of this symptom is a missing index, although I haven't spotted a missing index in the schema so far. But our schema is definitely on the complex side.
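
If the binary log indeed turns out to be unused on the production system, a minimal sketch of deactivating it on the official mysql:8 image (assuming the service is configured via a command: list in docker-compose) would be:

    command:
      - --skip-log-bin   # disables binary logging entirely on mysql:8
      # alternatively keep it and let entries expire quickly via
      # --binlog_expire_logs_seconds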

So first we need some help from @nclaidiere: Can you please give us the metrics for memory (free and used), disk (free and used), and CPU on the production system? Are there other applications running on the host? And can you please check if the binary log is activated on MySQL? And last, do you impose limits on the MySQL Docker container?
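
For reference, those numbers could be gathered roughly like this (standard Linux, Docker, and MySQL commands; the container name and credentials are assumptions and may need adjusting):

    # memory and disk usage on the host
    free -h
    df -h

    # live CPU/memory usage (and the memory limit) of the MySQL container
    docker stats --no-stream mysql

    # check whether the binary log is enabled (ON/OFF)
    docker exec -it mysql mysql -u root -p -e "SHOW VARIABLES LIKE 'log_bin';"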

@smathot
Collaborator

smathot commented Jul 12, 2024

Ok, @nclaidiere we'll wait for your input on this.

@nclaidiere
Collaborator Author

Hi, I am back from a break.

I did some tests: on my Windows laptop, a 1000 rows x 10 columns job list takes 1 min to upload; on the now more powerful Linux server (128 GB RAM, Intel Xeon Silver 4216 CPU @ 2.10 GHz × 64) it takes about 27 minutes.

Can you please give us the metrics for memory (free and used), disk (free and used), and CPU on the production system?
Memory (RAM) and disk space are largely free (> 80%).

Are there other applications running on the host?
Nope, it is used only for this

And can you please check if the binary log is activated on MySQL?
Binlog is activated, but I purge the logs very quickly; this has been the source of many problems. Here is the relevant section of the docker-compose file:
    services:
      mysql:
        container_name: mysql
        image: 'mysql:8'
        command: --sort_buffer_size=1G
        restart: always
        ports:
          - 3306:3306
        volumes:
          - mysql-data:/var/lib/mysql
        environment:
          MYSQL_ROOT_PASSWORD: 4D#BF_RPtrQ3Q2n=VuNR!k5WvhLsmB#8
          MYSQL_DATABASE: omm
          MYSQL_USER: omm
          MYSQL_PASSWORD: LhWxqJ^?pn@6AK+md6wZ_hsA=Qz52DSm
        command:
          - --sort_buffer_size=5M
          - --innodb_redo_log_capacity=1073741824
          - --binlog_expire_logs_seconds=86400

And last, do you impose limits on the MySQL Docker container?
I don't think so, hence the usage of all the available memory when the binlogs are not purged.

I'm not sure, but I have a feeling this also depends on the number of participants: 6 on the laptop, 36 on the server. If that is the case, then it is a problem, since the number of participants can easily increase to the order of a few thousand.

Hope this helps!

@dschreij
Member

dschreij commented Jul 15, 2024 via email
