
Reporting on Multiple Runs


Info

This post was imported from a personal note. It may contain inside jokes, streams of consciousness, errors, and other nonsense.

The problem I want to address is that when I make a change and run 30 generations, the output is not necessarily a good indicator of the effect of that change. The evolution of that strain may have had an unusually large impact from boid starting weights or food positions or the mutations. A good way to mitigate that is to do lots of runs.

I’d like to push a button and have 10-20 runs executed. Parallelized so it doesn’t take 20-40 minutes. Then the results output to a single line chart showing their performance and how they converged.

Matplotplusplus looks great. https://github.com/alandefreitas/matplotplusplus

Another option is Gnuplot. http://www.gnuplot.info/

I’m a little wary of implementing this stuff in C++, though. Development in C++ is not very fast, especially for a noob like me. I figure the performance C++ gives me when running the simulation and the agents’ neural networks makes the tradeoff worth it there. For collecting data and plotting charts, though, I don’t get that performance benefit, and the cost of development becomes too high.

Plus I need a refresher on Python and matplotlib. (-:

So do I invoke the C++ executable from Python or create a library with Python bindings?

Bindings look cool. Convenient for sure. More work to set up, but then it should be easier to use. https://github.com/pybind/pybind11

Invoking the executable would mean I need to figure out how to get the results back to Python. Doing it on stdout would suck.

Ah. “Interprocess communication.”

  • https://stackoverflow.com/a/14589269 - recommends sockets but provides an example for Python-to-Python only
  • https://stackoverflow.com/a/6915636 - recommends ZeroMQ for messaging and protobuf for serialization
  • https://stackoverflow.com/a/58169769 - recommends shared memory, but that’s for performance and I don’t need that
  • https://cfanatic.medium.com/non-blocking-ipc-between-python-and-c-d7f53ff992b8 - an article on using named pipes, but it’s using Boost::ASIO so no thanks

ZeroMQ docs are hilarious. I wonder if it’s more than I need, though. SO answers and comments do seem to point out that it’s easy to set up. The examples look compelling.

I’ll need to run the simulation as a server. Parallelization will need to be handled on the C++ side (I can’t just use Python to invoke the app multiple times… would that even work?) but that’s probably better.

Looks like Python _can_ invoke multiple executables in parallel. https://stackoverflow.com/a/36956286
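The linked answer boils down to: launch every process first with `subprocess.Popen` (which returns immediately), and only then wait on them. A minimal sketch — the simulation executable name and flags in the usage example are made up:

```python
import subprocess

def run_parallel(commands):
    """Launch every command at once, then wait for them all.

    Popen returns without blocking, so the processes run
    concurrently and the OS spreads them across cores.
    Returns the exit code of each run, in launch order.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in procs]
```

So ten runs would be something like `run_parallel([["./simulation", "--run-id", str(i)] for i in range(10)])`, where `./simulation` and `--run-id` are placeholders for the real executable and whatever arguments it actually takes.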

Maybe Python would start the ZeroMQ server and the simulation executables would send messages to it. Feels weird to have Python orchestrating the simulation instances but it’d be much faster to implement than writing C++ code to run simulations in parallel. I have zero need for communication between the simulation runs and don’t think I ever will. Even if I did there’s probably a way to do it with ZeroMQ.

Rubber Ducking


Hey, question. I’ve made a C++ app that runs a simulation which I need to be fast. But development in C++ is slow, especially for a noob like me. I’m writing a thing to run the simulation dozens or hundreds of times and generate reports / charts. Thinking of doing that in Python.

So I would run a Python script and it’ll invoke the simulation executable a bunch of times with some parameters and receive the results. It’ll aggregate the results and produce a chart.

If that doesn’t already sound stupid and/or crazy then I have a follow-up question about the details:

Python can invoke the executable multiple times at once, giving me an easy way to run the simulation in parallel (on different cores, too, I believe?). It feels a little weird having the orchestration on the Python side, but these simulation runs never need to talk to each other. I probably don’t have as much control over the processes this way, either, but it saves me writing a bunch of code in C++.

Another issue is communication. Using stdout to return the data would kinda suck because I’m sure I’ll inadvertently pollute it with log messages or something. And I kinda want those log messages. ZeroMQ looked small and simple enough to be a good option. Likely, the Python script would start up a message server and then the executables would all put their data on that.

I think a more typical setup would be the reverse: the C++ simulation starts up a server and my Python script sends a message asking for the simulation to be run. But I’m not gonna have multiple Python scripts connecting to this thing. Rather the other way around, multiple simulations connecting to the Python script.
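The shape of that pattern — Python listens, each simulation connects and pushes its result — can be sketched entirely in the standard library. This is not ZeroMQ; `multiprocessing.connection` is a stdlib stand-in for the same accept-and-collect flow, with the sender faked in Python (the real sender would be the C++ executable over a ZeroMQ socket):

```python
from multiprocessing.connection import Client, Listener

def collect_results(address, expected, authkey=b"boids"):
    """Reporting side: accept one connection per simulation run
    and collect whatever each run sends before it disconnects."""
    results = []
    with Listener(address, authkey=authkey) as listener:
        for _ in range(expected):
            with listener.accept() as conn:
                results.append(conn.recv())
    return results

def send_result(address, payload, authkey=b"boids"):
    """Simulation side (faked in Python here): connect to the
    collector and push a single result object."""
    with Client(address, authkey=authkey) as conn:
        conn.send(payload)
```

The address, `authkey`, and payload shape are all invented for the sketch; the point is just that the many-senders-one-collector direction works fine.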

Right?

Hmm. One thing that comes to mind is that I won’t be able to iterate quickly on the reporting if it’s tied to running the simulation every time anyway. Maybe it makes more sense to run the simulation and save a bunch of data files. Then I can pick those up with Python and tweak my reporting or explore the data on the fly. It won’t have the “one button” feature but I’ll have way more flexibility if that one run doesn’t produce the report that I wanted.
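The pick-up-the-files side could be as small as this. The `run*.csv` naming and the `generation`/`fitness` columns are guesses at what evolution_log.csv actually holds:

```python
import csv
from pathlib import Path

def load_runs(build_dir):
    """Read every run CSV in a build folder into a list of row dicts,
    one list per run. Assumes 'generation' and 'fitness' columns."""
    runs = []
    for path in sorted(Path(build_dir).glob("run*.csv")):
        with path.open(newline="") as f:
            runs.append(list(csv.DictReader(f)))
    return runs

def mean_fitness_per_generation(runs):
    """Average fitness across runs, generation by generation --
    the series the convergence line chart would plot."""
    by_gen = {}
    for rows in runs:
        for row in rows:
            by_gen.setdefault(int(row["generation"]), []).append(float(row["fitness"]))
    return {g: sum(vals) / len(vals) for g, vals in sorted(by_gen.items())}
```

Feed the resulting dict’s keys and values to matplotlib’s `plot` and that’s the chart; and because the files stick around, I can re-run just this part while tweaking the report.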

Organize the files:

build0000/
  strain0000.csv
  strain0001.csv
  strain0002.csv

Need to give the strain number up front so the simulations aren’t fighting over where to write their output.
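For example, the orchestrator could preassign each simulation its output file from the strain number before launching it — the zero-padded naming scheme here just mirrors the layout above:

```python
from pathlib import Path

def strain_path(build_dir, strain_id):
    """Unique, preassigned output file per strain, e.g.
    build0000/strain0002.csv, so parallel simulations
    never fight over where to write."""
    return Path(build_dir) / f"strain{strain_id:04d}.csv"
```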

I think I have a picture of what I’m building now. Not as sexy as the initial idea but this’ll work well for my needs.

Tasks?

  • Pass a strain or a run

Hold on. Definitions are getting muddy.

  • Run - An execution of the simulation. Can be a single generation or multiple.
  • Strain - Multiple generations of agents which evolved from the same initial set of agents. (“A genetic variant (…) within a biological species.” Species here would be the agent type.)
  • Population - “A community of [agents] among whose members interbreeding occurs.”

Tasks

  • Pass a run ID to the evaluate command. Have it save output to a file like build0000/run0000.csv. Right now I think evolution_log.csv is basically what I want.
  • Write a Python script to pick up the runs from a build folder and generate a chart.
  • Make the Python script default to the build folder with the highest number.
  • Make the evaluate command accept the number of generations as a parameter.
  • Bonus: implement running multiple simulations simultaneously on separate threads. Make the evaluate command accept the number of multi-generation runs as a parameter.
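The “default to the build folder with the highest number” task is only a few lines; the `buildNNNN` naming is taken from the layout above:

```python
import re
from pathlib import Path

def latest_build_dir(root="."):
    """Find the highest-numbered buildNNNN folder under root,
    so the reporting script can default to the newest build."""
    builds = [p for p in Path(root).iterdir()
              if p.is_dir() and re.fullmatch(r"build\d{4}", p.name)]
    if not builds:
        raise FileNotFoundError("no buildNNNN folders found")
    return max(builds, key=lambda p: int(p.name[len("build"):]))
```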