Conversation
a-r-j
left a comment
There was a problem hiding this comment.
Nice work! Looks really great. Only a couple minor points from me :)
biopandas/pdb/pandas_pdb.py
Outdated
| return output | ||
|
|
||
| @staticmethod | ||
| def gyradius(df: pd.DataFrame) -> float: |
There was a problem hiding this comment.
Could this default to self.df["ATOM"]?
ppdb.gyradius(ppdb.df["ATOM"] seems a little clunky
There was a problem hiding this comment.
E.g.
if df is None: df = self.df["ATOM"]
There was a problem hiding this comment.
Yeah, I thought that it might be better, I just imitated the usage of rmsd and made it a staticmethod as suggested in issue #9, but yes I can fix that
There was a problem hiding this comment.
Should I just remove the df parameter altogether and set df = self.df['ATOM']? Because with anything else other than 'ATOM' it won't work. 'ANISOU' and 'OTHERS' don't have coordinates, and 'HETATM' has atoms that are most likely not present in atomic_masses, and I don't know if anyone would even want to use the function for 'HETATM'. It could be made as a parameter that defaults to 'ATOM', but again, anything else doesn't really make sense
There was a problem hiding this comment.
So I think the way I would do it is to implement a standalone function and then have a method in the class that wraps that and calls it on self.df["ATOM"]. Having said that, I can imagine an edge case scenario where someone wants to compute gyradius on HETATMs (they're not always strictly non-protein atoms in pactice!)
There was a problem hiding this comment.
So I eventually added a records parameter so now you can specify if you want the rg for the "ATOM" record or "HETATM" or both. Not sure if atoms in HETATM would be relevant, but if they are for someone, more atom masses can be added to atomic_masses. There's also a decimals parameter that controls the rounding which is a a bit useless since anyone could probably use round(), but the final value had 14 decimals, so I thought that 4 is a good default for rounding and gives the option to change it.
There was a problem hiding this comment.
Awesome, final suggestion then is to move the atomic masses into constants.py (see #119 ). That way it's easy to accomodate hetatms like you suggest.
I.e.
import biopandas
biopandas.ATOMIC_MASSES = {"Si": 100, "Mg: 200"}
# Compute gyraidus...There was a problem hiding this comment.
How do I add them to constats.py considering that you've added the file in your recent pull request? Do I have to clone your main fork and then make a PR, and once merged the changes will be included in your current PR to biopandas?
There was a problem hiding this comment.
If you just make the file on this branch I think it should merge automatically. Otherwise I'll resolve the conflict on my branch
biopandas/pdb/pandas_pdb.py
Outdated
| center_of_mass = (masses[:, None] * coords).sum(axis=0) / total_mass | ||
| distances = np.linalg.norm(coords - center_of_mass, axis=1) | ||
| rg = np.sqrt((distances ** 2 * masses).sum() / total_mass) | ||
| return round(rg, 4) |
There was a problem hiding this comment.
I'm not sure we make a presumption about the accuracy people may want this at.
There was a problem hiding this comment.
So I could add an extra argument that defaults at let's say 3 or 4? Again, looked at rmsd which rounds to 4, I guess I could add the extra argument for it too.
docs/CHANGELOG.md
Outdated
| [https://github.com/rasbt/biopandas/blob/main/docs/sources/CHANGELOG.md](https://github.com/rasbt/biopandas/blob/main/docs/sources/CHANGELOG.md). | ||
|
|
||
|
|
||
| ### 0.5.2 (27-03-2023) |
There was a problem hiding this comment.
0.4.1 is still the latest release so these recent changes should actually go under 0.5.0dev
There was a problem hiding this comment.
Ok yeah mb I'll change that
Code of Conduct
Description
Added a new method
gyradiusinPandasPdbthat computes the radius of gyration of a molecule, along with some tests for itRelated issues or pull requests
Fixes #9
Pull Request Checklist
./docs/sources/CHANGELOG.mdfile (if applicable)./biopandas/*/testsdirectories (if applicable)biopandas/docs/sources/(if applicable)PYTHONPATH='.' pytest ./biopandas -svand make sure that all unit tests pass (for small modifications, it might be sufficient to only run the specific test file, e.g.,PYTHONPATH='.' pytest ./biopandas/classifier/tests/test_stacking_cv_classifier.py -sv)flake8 ./biopandas