CPU Spikes and FPS Drops When Playing Videos on Virtual Screens

 
Hey,

We've recently noticed that Aximmetry reports around 65% GPU and 70%+ CPU load, while Windows Task Manager shows CPU and GPU usage of only around 15-20%. We are using an RTX 3090 GPU and an AMD Ryzen 5950X CPU clocked at 4300 MHz.

Is Aximmetry not using multiple threads?
Lately we've been using some new UE4 environments with Aximmetry that are very well optimized and perform well, until we connect multiple videos (3+) from Aximmetry to the environment's virtual screens. We found that connecting the videos causes big CPU load spikes (from a stable 40% or so to 68%, with spikes to 80%+ every couple of seconds), and as the CPU load spikes, the frame rate drops from 25fps to below 20fps. We're also encountering the same CPU issue when trying to send NDI from Aximmetry.

Any help with improving performance here would be greatly appreciated.

Thanks.

   guyamit

 
TwentyStudios
  -  

Try using a different codec for the videos and make sure that your videos are the same frame rate as your project settings. It could even be a slow disk issue, so try enabling buffering for the videos and see if that helps. 
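As a rough sketch of the codec suggestion above: one way to transcode a file to an intermediate codec such as DNxHR and conform it to the project frame rate is with ffmpeg. The file names, the 25 fps default, and the `dnxhr_hq` profile are assumptions for illustration, not settings recommended by Aximmetry. The helper only builds the command so it can be inspected before running.

```python
import subprocess


def build_dnxhr_command(src, dst, fps=25):
    """Build an ffmpeg command that transcodes src to DNxHR HQ at a fixed frame rate.

    Assumes ffmpeg is installed and on PATH. The profile and pixel format
    are illustrative choices, not Aximmetry requirements.
    """
    return [
        "ffmpeg",
        "-i", src,                  # source file, e.g. an H.264 MP4
        "-c:v", "dnxhd",            # ffmpeg's DNxHD/DNxHR encoder
        "-profile:v", "dnxhr_hq",   # 8-bit 4:2:2 DNxHR profile
        "-pix_fmt", "yuv422p",      # pixel format that matches dnxhr_hq
        "-r", str(fps),             # conform output to the project frame rate
        "-c:a", "pcm_s16le",        # uncompressed 16-bit audio
        dst,
    ]


def transcode(src, dst, fps=25):
    """Run the transcode; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_dnxhr_command(src, dst, fps), check=True)
```

Calling `transcode("screen_loop.mp4", "screen_loop.mov", fps=25)` (hypothetical file names) would write a DNxHR `.mov` next to the source. Intermediate codecs like this produce much larger files, so disk speed matters more after the transcode.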

 
guyamit
  -  

Transcoding the videos from MP4 to DNxHR did help. But still, Aximmetry is not utilizing the majority (80%!) of the available CPU power. Encoding an NDI stream, for example, is unusable too, because CPU usage in Aximmetry is very high while Windows reports only about 20% usage. 4K projects that run natively in UE at 120fps barely reach 25fps in Aximmetry. I guess even investing in a Threadripper would not help, and might even make it worse, as the per-core clock frequency is lower than on the Ryzen 5950X.

Processor affinity in Windows Task Manager is set to use all processors, and setting the priority to High or Realtime did not help either. What do you think we should do?

 
TwentyStudios
  -  

Aximmetry is not optimized for multithreaded operation, so higher single-core performance is always better. Still, a drop from 120fps to 25fps doesn’t seem right and is not what we’re seeing on our systems at all.

Beyond just running the UE4 scene, there is a lot of overhead from keying, different render passes, playing back videos, capturing live video and outputting to a video board.

 
guyamit
  -  

Our keying is done externally by an Ultimatte, and we purchased an external recorder because Aximmetry was dropping frames when recording. Even a simple project without video layers cannot exceed 25fps in 4K. Does your system show a stable fps count? Ours shows 25.x, as if it's not locked.

Support wrote that Aximmetry does use multiple threads.

The computer has no other software installed and is dedicated to Aximmetry.

 
TwentyStudios
  -  

Aximmetry might use multiple threads for some tasks, but it’s not very multi-threaded. We run our projects in 4K 30p without any dropped frames. The frame rate meter will fluctuate slightly (fractions of a frame), but it doesn’t cause any drops or stutters. We have tested recording internally as well without any issues. An external recorder is of course still a better option since it saves on the CPU load. 

To me it sounds like you have a PCIe bandwidth issue or some other bottleneck that doesn’t show up on the CPU/GPU meter. If you’re sending key/fill from the Ultimatte in 4K while also outputting 4K and playing back video files in 4K, you’re putting a serious strain on the limited bandwidth of the Ryzen. The CPU meter in Aximmetry should easily be able to go to 70% without any dropped frames. What capture card are you using? Have you selected Sync on the main output of the capture card? I could try running your scene on our system to get you a comparison.

 
guyamit
  -  

Thank you for your kind offer to help. We are using two genlocked DeckLinks, one 8K (4 inputs) and a second 4K (2 outputs). The sync is on.

The main thing is that Aximmetry utilizes only 1/5 of the compute resources, which prevents us from running other CPU-intensive tasks like encoding NDI or playing MP4 videos on the fly without transcoding them first. Do you encounter the same?

 
TwentyStudios
  -  

What I’m trying to explain is that Aximmetry can’t spread the CPU load evenly across all the CPU cores, so you can’t look at the combined load. If a process is running on a single core at 100%, that is going to be the performance limit, no matter how many unused cores you have. That is why Aximmetry recommends high single-core performance over many CPU cores.
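To illustrate the point with a simplified model: the combined figure is just the average over all logical cores, so one saturated core barely moves it. The per-core numbers below are invented for illustration (32 logical cores, as on a 16-core CPU with SMT) and are not measurements from any real system.

```python
def summarize_load(per_core_percent):
    """Return (combined load, busiest-core load) for a list of per-core percentages."""
    combined = sum(per_core_percent) / len(per_core_percent)
    busiest = max(per_core_percent)
    return combined, busiest


# Hypothetical snapshot: one core saturated by a single heavy thread,
# the other 31 logical cores mostly idle.
snapshot = [100.0] + [15.0] * 31

combined, busiest = summarize_load(snapshot)
print(f"combined load: {combined:.1f}%")  # combined load: 17.7%
print(f"busiest core:  {busiest:.1f}%")   # busiest core:  100.0%
```

On a live system, per-core values could be read with a tool such as `psutil.cpu_percent(percpu=True)` or the per-core graph view in Task Manager. The scheduler may move a thread between cores, so the busiest core will not always show a clean 100%.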

I also think you’re running into PCIe and memory bandwidth limitations due to the insane amount of image data being transferred to and from the DeckLink cards. With video file playback added on top of that, it’s no wonder you’re seeing dropped frames. There is just too much data being passed between PCIe, CPU, GPU and RAM, especially on a Ryzen CPU.
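For a sense of scale, here is a back-of-the-envelope estimate of the raw video payload. The resolution, frame rate, pixel format (10-bit 4:2:2, about 20 bits per pixel) and the number of active streams are all assumptions. The actual formats and routing in this setup are not known from the thread.

```python
def stream_bytes_per_second(width, height, fps, bits_per_pixel):
    """Raw payload of one uncompressed video stream in bytes per second."""
    return width * height * bits_per_pixel // 8 * fps


# Assumptions: UHD 3840x2160 at 25 fps, 10-bit 4:2:2 (20 bits per pixel on average).
one_stream = stream_bytes_per_second(3840, 2160, 25, 20)

# Assumption: 4 inputs and 2 outputs each carry one such stream.
streams = 4 + 2
total = one_stream * streams

print(f"one stream: {one_stream / 1e6:.1f} MB/s")    # one stream: 518.4 MB/s
print(f"{streams} streams: {total / 1e9:.2f} GB/s")  # 6 streams: 3.11 GB/s
```

This counts each frame only once. In practice a frame may cross the bus several times (capture card to RAM, RAM to GPU, GPU back to RAM, RAM to output card), and decoded video files add to the total, so the real traffic may be a multiple of this figure.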

 
guyamit
  -  

So it's an open question for Aximmetry: why can't it use more than one core? This is castrating so many needed and basic functions like NDI and playback of video files.

Keying in Aximmetry instead of sending the signal out over SDI did not lower CPU/GPU usage.

 
TwentyStudios
  -  

I checked today and Aximmetry is definitely using more than one core. It’s just that it can’t split up a single, heavy task over several cores, which is expected for this type of real-time, low-latency application. I’m sure it can still be optimized further, but it’s not as simple as saying they are “castrating” basic functions just because you see unused CPU cycles in the performance monitor.

We’re using Aximmetry with several NDI inputs, 4K live inputs and video file playback without any frame drops and with low CPU load, so I still think there is something else going on with your project or your setup. What’s the CPU load with just the UE4 scene in Aximmetry, no file playback or live inputs? My offer still stands to test your project on one of our workstations.

 
grassfire
  -  

Hey guys, we are having similar issues. We unfortunately learned about Aximmetry's non-optimized multithreaded CPU performance AFTER we invested in a 64-core Threadripper CPU. I'm curious whether you have found any solutions. We find that our system runs fine until you open a menu in the program, open a small program, etc. However, after closing the menu or small program, Aximmetry never recovers.

TwentyStudios mentioned a PCIe bandwidth issue, BUT with Threadripper we have so many PCIe lanes that it would be nearly impossible for that to be the issue with our system.

Our specs:

  • Threadripper, 64 cores
  • 512GB RAM
  • 2TB NVMe system drive
  • 4TB NVMe media drive
  • RTX 3090 Ti
  • Two DeckLink 8K Pro cards
  • Windows 11 Pro


 
TwentyStudios
  -  

@grassfire: Poor multithreaded performance isn’t limited to Aximmetry; Unreal is also bad at splitting tasks across multiple threads. That’s why high-core-count Threadrippers aren’t recommended for real-time rendering in general. I’d say Aximmetry is particularly bad at multithreaded operation, since it has to do a lot more than just render the 3D scene. I’m hoping for a big push to optimize it soon.