This is probably going to be sacrilege to the Pythonistas, but I often wish there were support for some SQL-like syntax when working with (pandas) data frames. It would certainly make the process a lot smoother for some tasks.
TileDB[1] offers an embedded SQL experience for Python[2]. We use MariaDB in its embedded mode to run SQL against TileDB arrays, and this can be combined with pandas.read_sql to load the results directly into a pandas DataFrame. I wrote this implementation of the embedded SQL, so I'm happy to answer any questions about it. It's still in its early stages, but it should be fully functional for any query that MariaDB itself supports. Both projects are open source: TileDB is MIT licensed and the embedded SQL is under the GPL.
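Roughly, usage looks like the sketch below; the module path and connect() call are from memory, so double-check them against the TileDB docs. pandas.read_sql does the loading:

    import pandas as pd
    import tiledb.sql  # assumed module path for the embedded SQL bindings

    # Assumed DB-API-style connection; the TileDB array is addressed like a SQL table.
    db = tiledb.sql.connect()
    df = pd.read_sql("SELECT * FROM `my_array` WHERE attr > 10", con=db)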
You could try dumping your dataframes into an in-memory SQLite instance using the to_sql method and then running queries on that. I'm not sure how performant the to_sql step would be, but I'd expect joins in SQLite to be blazing fast compared to pandas' joins.
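Something like this, roughly (table and column names invented for the example):

    import sqlite3
    import pandas as pd

    orders = pd.DataFrame({"order_id": [1, 2, 3], "user_id": [10, 20, 10]})
    users = pd.DataFrame({"user_id": [10, 20], "name": ["alice", "bob"]})

    con = sqlite3.connect(":memory:")          # in-memory SQLite instance
    orders.to_sql("orders", con, index=False)  # dump each frame into a table
    users.to_sql("users", con, index=False)

    # Do the join in SQL and read the result back as a DataFrame
    result = pd.read_sql(
        "SELECT o.order_id, u.name "
        "FROM orders o JOIN users u ON o.user_id = u.user_id",
        con,
    )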
My guess is that eval and query have the worst performance, since their query strings need to be parsed first. As for the other methods, I'm not sure how they rank. Style-wise, I agree with the first comment on the linked page, which is to split up the conditions for readability:
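(Along these lines; the column names are just placeholders for illustration.)

    import pandas as pd

    df = pd.DataFrame({
        "age": [25, 32, 47],
        "country": ["US", "DE", "US"],
        "score": [0.4, 0.9, 0.7],
    })

    # Name each condition, then combine the boolean masks
    is_adult = df["age"] >= 30
    is_us = df["country"] == "US"
    high_score = df["score"] > 0.5

    filtered = df[is_adult & is_us & high_score]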
I strongly prefer .query() for legibility and for the fact that it can be used in a pipe. My only problem is that flake8 often will not detect the use of a variable inside the query string. Has anyone else come across this before?
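For reference, the pattern I mean (function and column names are placeholders). flake8 flags the variable as unused (F841) because its only reference is inside the query string:

    import pandas as pd

    def high_scores(df: pd.DataFrame) -> pd.DataFrame:
        # flake8 reports F841 ("local variable assigned but never used")
        # because threshold only appears inside the query string.
        threshold = 0.5
        return df.query("score > @threshold")

    result = pd.DataFrame({"score": [0.2, 0.8, 0.95]}).pipe(high_scores)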
Original title: "Pandas dataframe filter with Multiple conditions"