Technically yes. But it can take months to years to experimentally obtain the structure for a single protein, and that assumes that it's possible to crystallize (X-ray), prepare grids (cryo-EM) or highly concentrate (NMR) the protein at all.
On the other hand, validating a predicted protein structure to a good level of accuracy is much easier (solvent accessibility, mutagenesis, etc.). So having a complex model that can be trained on a small dataset drastically expands the set of accurate protein structure samples available to future models, both through direct predictions and validated protein structures.
So technically yes, this dataset could have been collected solely experimentally, but in practice, AlphaFold is now part of the experimental process. Without it, the world would have less protein structure data, in terms of both directly predicted and experimentally verified protein structures.
On my Framework (16), I've found that switching to GNOME's "Power Saver" mode strikes the right balance between thermals, battery usage and performance. I would recommend trying it. If you're not using GNOME, manually modifying `amd_pstate` and `amd_pstate_epp` (either via kernel boot parameters or runtime sysfs parameters) might help out.
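For anyone who wants to poke at this by hand, here's a rough sketch of the runtime sysfs route (paths and accepted values depend on your kernel and on which pstate driver is active, so treat this as illustrative rather than a recipe):

```shell
# Check which cpufreq driver is active (assumes a kernel built with amd_pstate support)
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver

# With the amd_pstate_epp driver active, bias toward power savings at runtime.
# Requires root; "power" and "balance_power" are among the accepted preferences.
echo power | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/energy_performance_preference

# Alternatively, select the driver mode at boot with a kernel parameter, e.g.:
#   amd_pstate=active
```

The available preference strings are listed in `energy_performance_available_preferences` next to that file, which is worth checking before writing anything.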
I agree that it's unfortunate that the power usage isn't better tuned out of the box. An especially annoying aspect of GNOME's "Power Saver" mode is that it disables automatic software updates, so you can't have both automatic updates and efficient power usage at the same time (AFAIK)
Typer has a great feature that lets you optionally accept argument and flag values from environment variables by providing the environment variable name:
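In Typer this is spelled `typer.Option(..., envvar="MY_VAR")`. For a dependency-free sketch of the same fallback pattern, here's the rough equivalent with the standard library's argparse (the flag and variable names below are made up for illustration):

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    # Mimics Typer's `envvar=` behavior: the flag falls back to the
    # environment variable when not passed explicitly on the command line.
    # MYAPP_API_KEY is an illustrative name, not a real convention.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--api-key",
        default=os.environ.get("MYAPP_API_KEY"),
        help="API key (defaults to $MYAPP_API_KEY)",
    )
    return parser
```

One difference worth noting: argparse evaluates the default when the parser is built, while Typer resolves the environment variable at call time.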
No, that's an anti-feature. :) Sibling comments here claim that command line arguments "leak" whereas environment variables do not. That's plain wrong. An attacker with access to an arbitrary process's cmdline surely also has access to its environ. Store secrets in files, not in the environment. Then you can easily change a secret by pointing the --secret-file parameter to a different file. The only reason people use BLABLA_API_KEY variables is that Heroku or something did it back in the day and everyone cargo-culted this terrible pattern.
One could write a huge treatise on everything that is wrong with environment variables. Avoid them like the plague. They are a huge usability PITA.
This is bad advice. Please don't make claims about security if you're making it up as you go.
Environment variables are substantially more secure than plain text files because they are not persistent. There are utilities for entering secrets into them without leaking them into your shell history.
That said, you generally should not use an environment variable either. You should use a secure temporary file created by your shell and pass the associated file descriptor. Most shells make such functionality available but the details differ (ie there is no fully portable approach AFAIK).
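In bash (not fully portable, as you say), process substitution is one way to do this; `myprog` below is a hypothetical consumer that takes a `--secret-file` parameter:

```shell
# bash process substitution: <(...) expands to a /dev/fd/N path backed by a
# pipe, so the secret reaches the child process without touching a named file.
API_SECRET='hunter2'   # illustrative value

# A hypothetical consumer would be invoked as:
#   myprog --secret-file <(printf '%s' "$API_SECRET")

# Demonstrate with cat standing in for the consumer:
cat <(printf '%s' "$API_SECRET")
```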
The other situation that sometimes comes up is that you are okay having the secret on disk in plain text but you don't want it inadvertently committed to a repository. In those cases it makes sense to either do as you suggested and have a dedicated file, or alternatively to set an environment variable from ~/.bashrc or similar.
One of my favorite features of loguru is the ability to contextualize logging messages using a context manager.
with logger.contextualize(user_name=user_name):
    handle_request(...)
Within the context manager, all loguru logging messages include the contextual information. This is especially nice when used in combination with loguru's native JSON formatting support and a JSON-compatible log archiving system.
One downside of loguru is that it doesn't automatically capture messages from Python's standard logging system, though you can bridge the two by installing a handler that forwards standard-library records to loguru. At the company where I work, we've created a helper, similar to Datadog's ddtrace-run executable, that automatically sets up these integrations and log message formatting with loguru.
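Loguru's documentation ships an `InterceptHandler` recipe for exactly this. Here's a stdlib-only sketch of the same forwarding idea, with a plain list standing in for loguru as the sink so it runs without the dependency:

```python
import logging


class ForwardingHandler(logging.Handler):
    """Sketch of the bridge idea: a stdlib handler that forwards every
    standard-library log record to another sink (loguru in the real recipe;
    a plain list here so this example is self-contained)."""

    def __init__(self, sink: list) -> None:
        super().__init__()
        self.sink = sink

    def emit(self, record: logging.LogRecord) -> None:
        # In loguru's recipe this would re-emit via logger.opt(...).log(...)
        self.sink.append((record.levelname, record.getMessage()))


captured: list = []
# force=True replaces any handlers already configured on the root logger
logging.basicConfig(handlers=[ForwardingHandler(captured)], level=logging.INFO, force=True)
logging.getLogger("thirdparty").warning("disk almost full")
```

Once the real handler is installed on the root logger, third-party libraries that use stdlib logging flow into loguru's sinks and formatting.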
Does loguru come with a lot of global state like the standard logging library? In my experience this makes logging a real PITA when forking multiple processes…
My usual workflow to rebase a stacked set of feature branches off of main is:
git checkout main
git pull
git checkout feature-branch-1
git rebase -i -
git checkout feature-branch-2
git rebase -i -
...
There's most likely a more efficient way to do this, but I find that referring to the most recently checked-out branch with a single hyphen is a lesser-known feature, so perhaps it will be helpful :)
Nice, didn't know about `-`. What's your process once you're in the interactive rebase? The `git rebase --onto` I use is noninteractive and could be an alias, were it not for the business of finding the `branchpoint` commit.
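For anyone following along, the noninteractive form alluded to here looks roughly like this (branch names illustrative, and `<branchpoint>` deliberately left as a placeholder since finding it is the hard part):

```shell
# Replay only the commits of feature-branch-2 that come after <branchpoint>
# onto the freshly rebased tip of feature-branch-1.
git rebase --onto feature-branch-1 <branchpoint> feature-branch-2

# One trick for stacked branches: right after rebasing feature-branch-1,
# its previous tip is usually still in the reflog, so (assuming the reflog
# entry exists) the branchpoint can be written as:
#   git rebase --onto feature-branch-1 'feature-branch-1@{1}' feature-branch-2
```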
I've always had a hard time understanding consistent hashing, and find rendezvous hashing [1] to be much more understandable. It provides better load-balancing properties and uses less memory than consistent hashing, but requires more computation.
It seems like consistent hashing is much more popular though, which definitely makes it worthwhile to learn.
I think jump-based consistent hashing also requires less memory, O(1), than rendezvous hashing.
From my limited perspective and the paper linked in the article it sounds like consistent hashing is best for numbered sharding (disk storage systems and databases) and rendezvous hashing is best for arbitrarily distributed storage where nodes can't be consecutively numbered.
My best attempt at explaining jump consistent hashing: from a key's hash you can determine how likely it is to move as buckets are added (small hash values make a move less likely, large ones more likely), and use that likelihood to jump directly to the key's next candidate bucket. About half of all keys move when going from 1 bucket to 2, only a third when going from 2 buckets to 3, and in general 1/n of keys move to bucket n.
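The published algorithm (Lamport & Veach's jump consistent hash) is only a few lines; this is a straightforward Python port of the C reference code:

```python
def jump_hash(key: int, num_buckets: int) -> int:
    """Jump consistent hash: maps a 64-bit key to a bucket in
    [0, num_buckets) using O(1) memory. The loop jumps through the sequence
    of buckets the key would move to as the bucket count grows, stopping at
    the last one below num_buckets."""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step, seeded by the key
        key = (key * 2862933555777941757 + 1) % 2**64
        # Next bucket index at which this key would move
        j = int((b + 1) * (1 << 31) / ((key >> 33) + 1))
    return b
```

The consistency property falls out of the construction: growing from n to n+1 buckets either leaves a key where it was or moves it to the new bucket n, and only about 1/(n+1) of keys take the move.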
Solvvy is a next-gen chatbot and customer support automation platform built with AI, ML, and Natural Language Processing technology.
Solvvy is looking for a Pythoneer passionate about applying cutting-edge developments in the Python programming language and ecosystem to build amazing support experiences. You'll be responsible for coaching other team members to use advanced Python features effectively, designing system architectures for new features and products, and building backend APIs for machine learning and data analytics services.
You'll be tasked with challenging technical problems and be responsible for developing innovative solutions. You'll work closely with a team of high-contributing engineers and product managers to solve user needs and work together to ensure that end-users enjoy a truly delightful support experience.
The best you can do (without doing crazy introspection) is to set `__all__` in `mymod`. It still requires listing everything you want to be exported from `mymod`, but at least you only need to do it once.
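To make the behavior concrete, here's a self-contained sketch (the module and function names are made up) that writes a throwaway module and shows that a star-import only picks up the names listed in `__all__`:

```python
import sys
import tempfile
from pathlib import Path

# An illustrative module: __all__ controls what `from mymod import *` exports.
module_source = '''
__all__ = ["public_func", "PublicClass"]

def public_func():
    return "hello"

class PublicClass:
    pass

def _helper():        # leading underscore: hidden from * even without __all__
    pass

def unlisted():       # public name, but absent from __all__, so not exported
    pass
'''

mod_dir = Path(tempfile.mkdtemp())
(mod_dir / "mymod.py").write_text(module_source)
sys.path.insert(0, str(mod_dir))

ns: dict = {}
exec("from mymod import *", ns)  # star-import honors __all__
exported = {name for name in ns if not name.startswith("__")}
print(exported)  # only the names listed in __all__
```

Note that `unlisted` is a perfectly ordinary public name, yet the star-import skips it: once `__all__` is set, it is the single source of truth for `import *`.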
I also struggle with Mypy's strict mode, but I think recursive types are different from classes that can construct each other. Mypy doesn't have any problem with the following code
from __future__ import annotations

class X:
    def __init__(self, value: int) -> None:
        self.value = value

    def to_y(self) -> Y:
        return Y(self.value)

class Y:
    def __init__(self, value: int) -> None:
        self.value = value

    def to_x(self) -> X:
        return X(self.value)

x = X(10).to_y().to_x()
print(x.value)
Sorry, yes, that works. I should have been a little more explicit: this breaks down when Y is generic over X and X is generic over Y. I've had tons of problems with mutually recursive TypeVars, though my example didn't show that. Sorry.