Hi David,
Again, thanks a lot for taking the time to investigate this. I will take a deeper look today; I am unsure what is going wrong with my solver time, but I am seeing a slowdown considerably larger than 6x. I know what I will say next is not a fair comparison, but I just finished my scalar+fluid simulation yesterday. When I was solving only the fluid for the coronary geometry (default parameters in the tutorial) in svSolver, using 96 processes (2 nodes; 2 Intel Xeon 8268 CPUs per node, Cascade Lake Platinum, 24 cores per CPU @ 2.90 GHz), it took around ~15 min for 6E3 external iterations.
But now, when I switched to svFSIplus coupled with the heat (HF) equation, the same computational power took ~60 hrs to finish. So I ran a scalability test using 4, 12, 24, and 48 cores on a cylinder geometry with just a resistance BC and 1000 time steps, then averaged the number of time steps svFSIplus solved per minute; see below:
[Attachment: scalability.png — scalability test results]
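For reference, the steps-per-minute averaging behind that plot boils down to something like the sketch below. The wall times here are placeholders, not my actual measurements; the efficiency is computed relative to the smallest core count.

```python
# Sketch of the scalability calculation: convert wall time per run into
# time steps solved per minute, then parallel efficiency relative to the
# smallest core count. Wall times below are PLACEHOLDERS, not measured data.

def steps_per_minute(n_steps, wall_minutes):
    """Average number of time steps solved per minute."""
    return n_steps / wall_minutes

def parallel_efficiency(cores, rates):
    """Efficiency of each run relative to the first (fewest-cores) run."""
    base_cores, base_rate = cores[0], rates[0]
    return [(r / base_rate) / (c / base_cores) for c, r in zip(cores, rates)]

cores = [4, 12, 24, 48]           # core counts used in the test
wall = [60.0, 21.0, 11.0, 6.0]    # placeholder wall times [min] for 1000 steps
rates = [steps_per_minute(1000, w) for w in wall]
for c, r, e in zip(cores, rates, parallel_efficiency(cores, rates)):
    print(f"{c:3d} cores: {r:6.1f} steps/min, efficiency {e:.2f}")
```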
So it seems to scale well with the number of processors. Yesterday I went ahead and submitted the same simulation I first attached (with the coronary BC) using 144 processes (3 nodes); so far it has run for ~16 hrs and reached 4.5E3 time steps. I plotted the wall time per time step for each solver (NS and HF), and there does not seem to be a gap between the two; the HF loop usually takes <1 s to solve. Also, all the ranks have a balanced load; it does not look like any processor is clogging the multi-threaded operations, as happened to me before.
[Attachment: timeHF-NS-svFSI.png — per-time-step solver time, NS vs HF]
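The check behind that plot is essentially the following sketch: given per-step wall times for the NS and HF solves (the values below are placeholders, not my measurements), report what fraction of each step the HF loop accounts for.

```python
# Minimal sketch of the NS-vs-HF timing check. Per-step solve times below
# are PLACEHOLDERS chosen to mimic the observed pattern (HF < 1 s per step).

def hf_fraction(ns_times, hf_times):
    """Fraction of total solve time spent in the HF solver, per time step."""
    return [hf / (ns + hf) for ns, hf in zip(ns_times, hf_times)]

ns = [9.5, 10.1, 9.8]   # placeholder NS solve times per step [s]
hf = [0.6, 0.7, 0.5]    # placeholder HF solve times per step [s]
for i, f in enumerate(hf_fraction(ns, hf), start=1):
    print(f"step {i}: HF is {100 * f:.1f}% of the solve time")
```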
Lastly, I did some profiling on a single thread (on a cylinder case; output below). Taking a quick look, I did not find anything suspicious, although I have not done a deep dive into the profiling results yet. I am attaching the results for the functions that take >1% of the instruction count. I will try changing the preconditioner, as that may be adding some extra time.
I will do a deeper analysis today and report back if I find anything interesting. If you could let me know which MPI version you are using for svFSIplus, that would be great.
Code:
--------------------------------------------------------------------------------
Ir
--------------------------------------------------------------------------------
95,404,227,520 (100.0%) PROGRAM TOTALS
--------------------------------------------------------------------------------
Ir file:function
--------------------------------------------------------------------------------
25,481,833,114 (26.71%) /home/ibartol/svFSIplus-package/svFSIplus/Code/Source/svFSILS/omp_la.cpp:omp_la::omp_sum_v(int, int, double, Array<double>&, Array<double> const&) [/home/ibartol/svFSIplus-package/build/svFSI-build/bin/svFSI]
21,845,359,360 (22.90%) /home/ibartol/svFSIplus-package/svFSIplus/Code/Source/svFSILS/dot.cpp:dot::fsils_nc_dot_v(int, int, Array<double> const&, Array<double> const&) [/home/ibartol/svFSIplus-package/build/svFSI-build/bin/svFSI]
8,385,359,946 ( 8.79%) /home/ibartol/svFSIplus-package/svFSIplus/Code/Source/svFSILS/spar_mul.cpp:spar_mul::fsils_spar_mul_vv(fsi_linear_solver::FSILS_lhsType&, Array<int> const&, Vector<int> const&, int, Array<double> const&, Array<double> const&, Array<double>&) [/home/ibartol/svFSIplus-package/build/svFSI-build/bin/svFSI]
7,021,782,083 ( 7.36%) /home/ibartol/svFSIplus-package/svFSIplus/Code/Source/svFSI/fluid.cpp:fluid::fluid_3d_m(ComMod&, int, int, int, double, Array<double> const&, Vector<double> const&, Vector<double> const&, Array<double> const&, Array<double> const&, Array<double> const&, Array<double> const&, Array<double> const&, Array<double> const&, Array<double>&, Array3<double>&) [/home/ibartol/svFSIplus-package/build/svFSI-build/bin/svFSI]
3,740,615,368 ( 3.92%) /home/ibartol/svFSIplus-package/svFSIplus/Code/Source/svFSI/fluid.cpp:fluid::fluid_3d_c(ComMod&, int, int, int, double, Array<double> const&, Vector<double> const&, Vector<double> const&, Array<double> const&, Array<double> const&, Array<double> const&, Array<double> const&, Array<double> const&, Array<double> const&, Array<double>&, Array3<double>&) [/home/ibartol/svFSIplus-package/build/svFSI-build/bin/svFSI]
3,388,837,579 ( 3.55%) /home/ibartol/svFSIplus-package/svFSIplus/Code/Source/svFSI/heatf.cpp:heatf::heatf_3d(ComMod&, int, double, Vector<double> const&, Array<double> const&, Array<double> const&, Array<double> const&, Array<double> const&, Array<double>&, Array3<double>&) [/home/ibartol/svFSIplus-package/build/svFSI-build/bin/svFSI]
2,202,734,458 ( 2.31%) ./malloc/./malloc/malloc.c:_int_free [/usr/lib/x86_64-linux-gnu/libc.so.6]
1,766,583,006 ( 1.85%) /home/ibartol/svFSIplus-package/svFSIplus/Code/Source/svFSI/Array.h:spar_mul::fsils_spar_mul_vv(fsi_linear_solver::FSILS_lhsType&, Array<int> const&, Vector<int> const&, int, Array<double> const&, Array<double> const&, Array<double>&)
1,680,982,663 ( 1.76%) /home/ibartol/svFSIplus-package/svFSIplus/Code/Source/svFSI/Array.h:gmres::gmres_s(fsi_linear_solver::FSILS_lhsType&, fsi_linear_solver::FSILS_subLsType&, int, Vector<double> const&, Vector<double>&)
1,605,408,588 ( 1.68%) /home/ibartol/svFSIplus-package/svFSIplus/Code/Source/svFSI/lhsa.cpp:lhsa_ns::do_assem(ComMod&, int, Vector<int> const&, Array3<double> const&, Array<double> const&) [/home/ibartol/svFSIplus-package/build/svFSI-build/bin/svFSI]
1,477,185,778 ( 1.55%) ./malloc/./malloc/malloc.c:malloc [/usr/lib/x86_64-linux-gnu/libc.so.6]
1,324,011,215 ( 1.39%) ./string/../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:__memset_sse2_unaligned [/usr/lib/x86_64-linux-gnu/libc.so.6]
1,217,267,390 ( 1.28%) /home/ibartol/svFSIplus-package/svFSIplus/Code/Source/svFSI/nn.cpp:nn::gnn(int, int, int, Array<double>&, Array<double>&, Array<double>&, double&, Array<double>&) [/home/ibartol/svFSIplus-package/build/svFSI-build/bin/svFSI]
755,234,208 ( 0.79%) ./malloc/./malloc/malloc.c:free [/usr/lib/x86_64-linux-gnu/libc.so.6]
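Incidentally, the ">1%" cut on the listing above can be scripted. A minimal sketch, assuming callgrind_annotate-style lines of the form "Ir (pct%) file:function" (which the output above resembles); the sample lines are abbreviated from the listing:

```python
# Filter profiler output lines of the form "Ir (pct%) file:function",
# keeping entries above a percentage threshold. Assumes the
# callgrind_annotate-style layout shown in the listing above.
import re

LINE_RE = re.compile(r"^\s*([\d,]+)\s+\(\s*([\d.]+)%\)\s+(\S+)")

def hot_functions(text, threshold=1.0):
    """Return (percent, location) pairs whose Ir share exceeds threshold."""
    hits = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m and float(m.group(2)) > threshold:
            hits.append((float(m.group(2)), m.group(3)))
    return hits

# Abbreviated sample lines taken from the listing above.
sample = """\
25,481,833,114 (26.71%) omp_la.cpp:omp_la::omp_sum_v
21,845,359,360 (22.90%) dot.cpp:dot::fsils_nc_dot_v
   755,234,208 ( 0.79%) malloc.c:free
"""
for pct, loc in hot_functions(sample):
    print(f"{pct:5.2f}%  {loc}")
```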
Thanks a lot in advance!
Best,
Ignacio