World Community Grid Forums

Category: Active Research

Forum: OpenPandemics - COVID-19 Project

Thread: The client doesn't account for the GPU thread...

Thread Status: Active
Total posts in this thread: 8

This topic has been viewed 6090 times and has 7 replies

DennyInDurham
Cruncher
USA
Joined: Aug 4, 2020
Post Count: 23

The client doesn't account for the GPU thread...

If I allocate 100% of CPU resources to the client, on this particular Hyperthreaded 6-core Xeon, it would run 12 work units at 100% CPU utilization (which is fine, it maximizes throughput). With the GPU work unit, there are now 13 running. This was OK when the GPU thread used minimal CPU, but the current ones are CPU-bound for a considerable time at the beginning, about 7.5 minutes with the CPU maxed out. Reducing the work units to 12 (i.e., CPUs to 95%) reduces the CPU-bound time to about 6.5 minutes, and maximizes the GPU utilization.

The client should probably do this automagically if GPU work units are going to be significantly CPU-bound.
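The thread counts above can be checked with a little arithmetic. This is a minimal sketch, assuming the client turns the "use at most N% of the CPUs" preference into a whole number of threads by rounding down (with a floor of one); the function name is hypothetical and is not part of BOINC.

```python
import math


def usable_threads(logical_cpus: int, cpu_percent: float) -> int:
    """Threads available for CPU work units, assuming the client rounds down."""
    return max(1, math.floor(logical_cpus * cpu_percent / 100))


# 6-core hyperthreaded Xeon = 12 logical CPUs
for pct in (100, 95):
    cpu_tasks = usable_threads(12, pct)
    # One GPU work unit runs on top of the CPU work units in this scenario.
    print(f"{pct}% -> {cpu_tasks} CPU tasks + 1 GPU task = {cpu_tasks + 1} running")
```

Under that assumption, 100% gives 12 CPU tasks plus the GPU task (13 running), and 95% gives 11 CPU tasks plus the GPU task (12 running), which matches the counts reported in the post.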

[Apr 28, 2021 4:54:11 PM]

Ian-n-Steve C.
Senior Cruncher
United States
Joined: May 15, 2020
Post Count: 180

Re: The client doesn't account for the GPU thread...

1. Don't ever run the CPU at the 100% setting in BOINC. That's just asking to overcommit the CPU and bog things down. Remember, the OS still needs some CPU cycles to run things in the background. Always try to leave at least 1 thread free.

2. Are you running an NVIDIA GPU? Using a full core per GPU task is normal for most NVIDIA OpenCL applications. This doesn't necessarily mean the task is "CPU bound". Probably something is set up wrong or not well optimized, causing the increased GPU idle time. The CPU stays engaged no matter what the GPU app is doing.

3. BOINC doesn't properly count CPU use when considering values less than 1. I think WCG ships these tasks with a default CPU use estimate of 0.9 CPU + 1 GPU. Well, to BOINC, that 0.9 really means 0: it adds up the CPU portions and chops the sum down to an integer, truncating the decimal remainder. So:
0.9 CPU = BOINC reserves 0 CPUs for the GPU task
0.9+0.9 = BOINC reserves 1 CPU for the GPU tasks
0.9+0.9+0.9 = BOINC reserves 2 CPU for the GPU tasks
and so on.
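The bookkeeping described in point 3 can be sketched as follows. This only models the truncation as the post describes it; it is not BOINC's actual code, and the function name is hypothetical. `Decimal` is used so that sums such as 0.9 + 0.9 are exact.

```python
from decimal import Decimal


def reserved_cpus(cpu_per_gpu_task: str, gpu_tasks: int) -> int:
    """CPUs set aside for GPU tasks if the summed estimate is truncated."""
    total = Decimal(cpu_per_gpu_task) * gpu_tasks
    return int(total)  # int() drops the fractional part


for n in (1, 2, 3):
    print(n, "GPU task(s):",
          reserved_cpus("0.9", n), "reserved at 0.9 CPU each,",
          reserved_cpus("1.0", n), "reserved at 1.0 CPU each")
```

With an estimate of 1.0 CPU per GPU task, every GPU task gets its own reserved CPU in this model.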

You can rectify this by adding an app_config.xml file that forces 1.0 CPU + 1.0 GPU. Then BOINC will properly account for the CPU usage of GPU tasks.
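As a rough illustration of such a file, the sketch below generates an app_config.xml using the `gpu_versions`, `gpu_usage` and `cpu_usage` elements from BOINC's client configuration format. The app name `opng` is an assumption here: it must match the application's short name as your client reports it. The helper name is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, cpu_usage: float = 1.0,
                     gpu_usage: float = 1.0) -> str:
    """Return app_config.xml text reserving cpu_usage CPUs per GPU task."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name  # assumed app short name
    gpu = ET.SubElement(app, "gpu_versions")
    ET.SubElement(gpu, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(gpu, "cpu_usage").text = str(cpu_usage)
    if hasattr(ET, "indent"):  # pretty-printing is available on Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


print(build_app_config("opng"))
```

The resulting text would be saved as app_config.xml in the project's folder inside the BOINC data directory, and the client then told to re-read its config files.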

----------------------------------------

EPYC 7V12 / [5] RTX A4000
EPYC 7B12 / [5] RTX 3080Ti + [2] RTX 2080Ti
EPYC 7B12 / [6] RTX 3070Ti + [2] RTX 3060
[2] EPYC 7642 / [2] RTX 2080Ti

[Apr 28, 2021 5:21:01 PM]

DennyInDurham
Cruncher
USA
Joined: Aug 4, 2020
Post Count: 23

Re: The client doesn't account for the GPU thread...

1. macOS on this old Mac Pro has been perfectly happy running at 100% for several months now... the box doesn't run anything but WCG, and occasional Remote Desktop to monitor WCG.

2. It's an AMD. The task is CPU-bound for several minutes (apparently setting things up). This started with the 1xxxx WUs.

3. I haven't felt the desire or need to get into BOINC plumbing. I'm only observing that things that worked optimally before GPU WUs now work suboptimally without a little tweaking.

[Apr 28, 2021 5:56:00 PM]

Ian-n-Steve C.
Senior Cruncher
United States
Joined: May 15, 2020
Post Count: 180

Re: The client doesn't account for the GPU thread...

Tweaking is always necessary to get things working more optimally. You’re not going to get what you want without tweaking just because of the nature of what you’re doing and how BOINC operates.

Just because it “worked fine” at 100% doesn’t mean it’s optimal. As you’ve found out. I’d still recommend getting out of the habit of running 100% and instead always leave some breathing room for the system. At least 1 thread.

[Apr 28, 2021 6:58:46 PM]

spRocket
Senior Cruncher
Joined: Mar 25, 2020
Post Count: 234

Re: The client doesn't account for the GPU thread...

My own experience with the Ryzen 7 1700 (8 cores/2 threads per core) is that setting things up for 15 CPU tasks leaves breathing room for 1 GPU task plus system overhead.

I'm currently running 14 CPU tasks/2 GPU tasks, configured for 0.5 CPU per GPU task. This seems to be humming along nicely.

[Apr 28, 2021 7:14:26 PM]

Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 11816

Re: The client doesn't account for the GPU thread...

I stopped running opng because all were erroring. However, when I was, the status was showing 0.17 CPU + 1 GPU on my Intel machine. That means that the units were using a whole GPU and only used the CPU for 17% of the time.

In other words, most of the time it was not using the CPU. I presume that meant that it was only using the CPU for uploading and downloading which ties in with the observation about higher CPU at the start.

I have for some time now been using all my 8 threads (4 cores hyperthreaded) for WCG and also my sole GPU for Einstein GRP4. That states in the Status 0.5 CPU + 1 GPU but I find that it hardly slows down the WCG units. The other Einstein projects say 1 CPU + 1 GPU and only allow 7 WCG units to run.

Mike

[Apr 28, 2021 7:27:14 PM]

Ian-n-Steve C.
Senior Cruncher
United States
Joined: May 15, 2020
Post Count: 180

Re: The client doesn't account for the GPU thread...

"However, when I was, the status was showing 0.17 CPU + 1 GPU on my Intel machine. That means that the units were using a whole GPU and only used the CPU for 17% of the time."

That's not really what that means. The CPU use % is only an estimate provided by the project and has no bearing on what ACTUALLY gets used by the application. In other words, you cannot change what the science application is doing by changing these values. Changing it to 0.5 won't make it use 50% of a core. I could set my NVIDIA cards to 0.5 and they would still use a full CPU core.

What this value DOES do is act as bookkeeping for BOINC when it calculates how many resources are available to run other tasks, with the caveat that BOINC truncates the values down to integers. So when BOINC wants to know how many free CPU cores it can use, that 0.17 really ends up being 0. If you run 2 tasks, it will be 0.17 + 0.17 = 0.34, which is still 0 to BOINC. You'd need to run 6 tasks concurrently to get over 1 so that BOINC actually reserves a free core to support the tasks. If you run an 8-thread CPU with 0.17 CPU + 1 GPU and tell it to use 100% of the CPU, it will spin up all 8 tasks on the CPU and then throw the GPU task on top of that, forcing everyone to fight for CPU resources that have been overprovisioned.
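The 8-thread scenario can be worked through numerically. This is a minimal sketch of the truncation model described in this post, not BOINC's scheduler; both function names are hypothetical.

```python
from decimal import Decimal


def running_tasks(threads: int, cpu_per_gpu_task: str, gpu_tasks: int):
    """Return (cpu_tasks, total_tasks) under the truncating bookkeeping."""
    reserved = int(Decimal(cpu_per_gpu_task) * gpu_tasks)  # fraction dropped
    cpu_tasks = max(0, threads - reserved)
    return cpu_tasks, cpu_tasks + gpu_tasks


def tasks_needed_to_reserve_core(cpu_per_gpu_task: str) -> int:
    """Smallest number of concurrent GPU tasks whose estimates sum to >= 1."""
    per_task = Decimal(cpu_per_gpu_task)
    if per_task <= 0:
        raise ValueError("CPU estimate must be positive")
    n = 1
    while int(per_task * n) < 1:
        n += 1
    return n


print(running_tasks(8, "0.17", 1))           # (8, 9): 9 tasks on 8 threads
print(running_tasks(8, "1.0", 1))            # (7, 8): one thread kept free
print(tasks_needed_to_reserve_core("0.17"))  # 6
```

The second case lines up with the earlier observation that a 1 CPU + 1 GPU task allows only 7 CPU work units on an 8-thread machine.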

[Apr 28, 2021 9:05:34 PM]

DennyInDurham
Cruncher
USA
Joined: Aug 4, 2020
Post Count: 23

Re: The client doesn't account for the GPU thread...

(This is on macOS.) I've watched Activity Monitor carefully while the OPNG tasks were running... multiple minutes of CPU-bound work, followed by a surprising amount of CPU activity while the GPU is in use. For the five-digit WUs, a whole CPU allocated is absolutely appropriate.

The OS handles 100% CPU perfectly well, but reducing the number of allocated CPUs by 2 does cause the OPNG WU to run faster, and gives the OS a bit of headroom to operate.

[Apr 28, 2021 10:57:56 PM]