macOS SSH File System (SSHFS)
The biggest 'I wish I knew this earlier' when doing ssh workflows on small projects
Here's probably the most useful thing that I've learnt this year while working with computer clusters remotely. If this sounds obvious to you and you're already good at this stuff, feel free to ignore this post!
If you're like me and a beginner at working on remote HPC clusters with Slurm for computational physics / chemistry, one thing that I naively used to do was:
write Python scripts, input files and logic on my local computer, using IDEs and tools on my laptop, and then
ssh / rsync them to the remote cluster, and then
do the computation, and then
ssh or rsync them back.
But this is kind of annoying for small projects when you're just starting out, especially when you end up writing a bunch of boilerplate code to sync files back and forth, which leads to more and more lines of code. And guess what: I spent too much time on that code!
I don't want this to happen, because I generally want to delete as much code as possible, as long as the tradeoff between complexity and speed is worth it.
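To give a sense of the sync boilerplate I mean, here is a minimal sketch in Python. The host name, the paths and the helper names are all made up for illustration, and it only prints the `rsync` commands unless you turn `dry_run` off.

```python
import shlex
import subprocess

# Hypothetical names for illustration only; replace them with your own.
REMOTE = "user@cluster.example.org"
REMOTE_DIR = "/home/user/project"
LOCAL_DIR = "project"


def rsync_command(source: str, destination: str) -> list[str]:
    """Build an rsync call that copies the contents of source into destination."""
    # -a preserves permissions and timestamps, -z compresses data in transit.
    # The trailing slash on the source means "the contents of this directory".
    return ["rsync", "-az", source.rstrip("/") + "/", destination]


def push_command() -> list[str]:
    """Laptop -> cluster, before running the computation."""
    return rsync_command(LOCAL_DIR, f"{REMOTE}:{REMOTE_DIR}")


def pull_command() -> list[str]:
    """Cluster -> laptop, after the computation has finished."""
    return rsync_command(f"{REMOTE}:{REMOTE_DIR}", LOCAL_DIR)


def run(command: list[str], dry_run: bool = True) -> None:
    """Print the command, and execute it only when dry_run is False."""
    print(shlex.join(command))
    if not dry_run:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    run(push_command())
    run(pull_command())
```

Even this toy version is around thirty lines, and it grows quickly once you add excludes, several projects or error handling.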
So instead, I realised that I can use macFUSE with sshfs to mount the remote directory on my local machine, so that I can treat it like any other directory. There are some fiddly bits, like changing the security permissions on an M1 Mac, but I found the tutorial pretty easy to follow.
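As a rough sketch of what the mount step looks like once macFUSE and sshfs are installed, the snippet below builds the `sshfs` and `umount` commands from Python. The remote address, the mount point and the function names are assumptions for illustration, not something taken from the tutorial, and nothing is executed unless `dry_run` is set to `False`.

```python
import shlex
import subprocess
from pathlib import Path

# Hypothetical values for illustration only; replace them with your own.
REMOTE = "user@cluster.example.org:/home/user/project"
MOUNT_POINT = Path.home() / "mnt" / "cluster"


def mount_command(remote: str, mount_point: Path) -> list[str]:
    """Build the sshfs call: sshfs [user@]host:[dir] mountpoint [options]."""
    # "reconnect" asks sshfs to re-establish the connection if it drops.
    return ["sshfs", remote, str(mount_point), "-o", "reconnect"]


def unmount_command(mount_point: Path) -> list[str]:
    """Build the call that detaches the mounted directory again."""
    return ["umount", str(mount_point)]


def mount(remote: str, mount_point: Path, dry_run: bool = True) -> None:
    """Print the mount command, and run it only when dry_run is False."""
    command = mount_command(remote, mount_point)
    print(shlex.join(command))
    if not dry_run:
        # The mount point must exist as a local directory before mounting.
        mount_point.mkdir(parents=True, exist_ok=True)
        subprocess.run(command, check=True)


if __name__ == "__main__":
    mount(REMOTE, MOUNT_POINT)
    print(shlex.join(unmount_command(MOUNT_POINT)))
```

After mounting, the remote project shows up under the mount point and can be opened with whatever local editor or IDE you like. Every read and write still goes over the network, so large files will feel slower than truly local ones.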
Ways I edit files on a cluster:
- Use the VS Code Remote - SSH extension. I don't enjoy editing with VS Code, but this is imo the smoothest experience. If there is latency between your machine and the cluster, you mostly won't notice it while typing, because the editor UI runs on your local machine and VS Code sends your changes to the files on the cluster in the background.
- Run a very lightweight editor (e.g. neovim) directly on the cluster, inside a single-CPU job. This requires a good connection to the cluster, otherwise it will feel laggy. For the very rare times I'm editing while not connected to the internet, I just git pull locally and work there (as if me editing locally and me editing on the cluster were two different devs).