RANDOM STUFF - ciemat-tic/codec GitHub Wiki
- reboot a node
scontrol update nodename=x state=down reason=hung
scontrol update nodename=x state=resume
- see why a job is not running
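One possible way to do this, assuming a standard Slurm installation with `squeue` and `scontrol` on the PATH, is to ask Slurm for the job's state and the reason attached to it. The helper name `why_pending` is hypothetical; the format fields (`%i` job id, `%T` state, `%r` reason) are the ones `squeue` provides.

```shell
# Hypothetical helper: print job id, state and the reason Slurm gives
# for the job being in that state (for example Resources or Priority).
why_pending() {
    squeue -j "$1" -o "%i %T %r"
}

# Usage (needs a running Slurm cluster):
#   why_pending 1234
#   scontrol show job 1234    # the full job record also includes a Reason= field
```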
When installing software in non-default locations, you have to make sure that it is found by all users. This is done by:
- adding the new paths to the default PATH
#/etc/profile
export PATH=new_stuff:$PATH
- loading it on start, including for commands run over ssh
#.bashrc
if [ -f /etc/profile ]; then
. /etc/profile
fi
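Because /etc/profile can end up being read more than once (by the login shell and again from .bashrc), the same directory may be added to PATH several times. A minimal sketch of a guard against that follows; the function name `path_prepend` is an assumption made for illustration, not something the system provides.

```shell
# Hypothetical helper: prepend a directory to PATH only if it is not there yet,
# so that sourcing /etc/profile several times does not keep growing PATH.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$1:$PATH" ;;     # not present, put it in front
    esac
}

# Usage in /etc/profile (new_stuff stands for the real install path):
#   path_prepend new_stuff
#   export PATH
```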
Some parameters can be tweaked so that completed job records are purged sooner. This increases overhead, so it is not useful for production environments, only for development purposes.
#slurmctld/slurmctld.h
#define PURGE_JOB_INTERVAL 300 /* set to 10 or so */
The next one can be adjusted in slurm.conf (MinJobAge) or in the source code.
#common/read_config.h
#define DEFAULT_MIN_JOB_AGE 300 /* set to 10 or so, too */
The documentation says: "The minimum age of a completed job before its record is purged from SLURM's active database. Set the values of MaxJobCount and MinJobAge to insure the slurmctld daemon does not exhaust its memory or other resources. The default value is 300 seconds. A value of zero prevents any job record purging. May not exceed 65533."
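To check which value the controller is actually using after changing slurm.conf or rebuilding, the running configuration can be queried. This is only a sketch: `show_min_job_age` is a hypothetical helper name, and it assumes `scontrol` is on the PATH and slurmctld is running.

```shell
# Hypothetical helper: print the MinJobAge line reported by the running slurmctld.
show_min_job_age() {
    scontrol show config | grep -i '^MinJobAge'
}

# The slurm.conf setting discussed above would look like:
#   MinJobAge=10
# Usage (needs a running Slurm controller):
#   show_min_job_age
```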
This is useful to mount the CIEMAT shared folder at boot.
more /etc/hosts
(..)
172.17.112.14 cendat
more /etc/fstab
(...)
//cendat/u5682 /mnt/samba cifs username=xxx,password=xxx 0 0
It probably requires some software to be installed. I used
sudo apt-get install samba samba-client smbclient
although I suspect most of these packages are unnecessary.
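The fstab line above keeps the password in /etc/fstab, which is normally readable by every user. mount.cifs also accepts a `credentials=` option pointing to a separate file, so one alternative is sketched below. The helper name `write_cifs_credentials` and the path /root/.smbcredentials are assumptions made for illustration.

```shell
# Hypothetical helper: write a CIFS credentials file that only its owner can read.
# Arguments: target file, user name, password.
write_cifs_credentials() {
    cred_file=$1
    cred_user=$2
    cred_pass=$3
    # The subshell keeps the restrictive umask from leaking into the caller.
    ( umask 077; printf 'username=%s\npassword=%s\n' "$cred_user" "$cred_pass" > "$cred_file" )
}

# Usage (as root; the path is an assumed example):
#   write_cifs_credentials /root/.smbcredentials xxx xxx
# and then in /etc/fstab:
#   //cendat/u5682 /mnt/samba cifs credentials=/root/.smbcredentials 0 0
```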
The problem is that two keys are mapped wrongly on the Spanish (es) keyboard layout.
On CentOS 7 this is solved by modifying
#/usr/share/X11/xkb/symbols/es
xkb_symbols "basic" {
(....)
#comment this line (original one)
// key <TLDE> { [ masculine, ordfeminine, backslash, backslash ] };
#and add these two (the first replaces the functionality of the previous line, the second modifies the default behaviour of another key)
key <LSGT> { [ masculine, ordfeminine, backslash, backslash ] }; // this is the top-left key; it should go at the bottom left
key <TLDE> {[ less, greater, guillemotleft, guillemotright ]};
Recently, we released a larger dataset. It covers a longer period of time (29 days) for a larger cell (about 11k machines) and includes significantly more information, including:
- the original resource requests, to permit scheduling experiments
- request constraints and machine attributes
- machine availability and failure events
- some of the reasons for task exits
- (obfuscated) job and job-submitter names, to help identify repeated or related jobs
- more types of usage information
- CPI (cycles per instruction) and memory traffic for some of the machines
Make sure that the path indicated in ExecStart is correct.
Then slurmd, slurmctld and slurmdbd can be started, stopped or restarted with, for example:
service slurmd stop
service slurmd start
service slurmd restart
slurmctld
#systemd/system/slurmctld.service
[Unit]
Description=slurmctld Service
After=home.mount
[Service]
Type=simple
User=slurm
ExecStart=/home/localsoft/slurm/sbin/slurmctld -cD
Restart=on-abort
[Install]
WantedBy=multi-user.target
slurmd
#systemd/system/slurmd.service
[Unit]
Description=slurmd Service
After=home.mount
[Service]
Type=simple
User=slurm
ExecStart=/home/localsoft/slurm/sbin/slurmd -cD
Restart=on-abort
[Install]
WantedBy=multi-user.target
slurmdbd
#systemd/system/slurmdbd.service
[Unit]
Description=slurmdbd Service
After=home.mount
[Service]
Type=simple
User=slurm
ExecStart=/home/localsoft/slurm/sbin/slurmdbd -cD
Restart=on-abort
[Install]
WantedBy=multi-user.target
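To check that the ExecStart path in unit files like the ones above is correct, the binary can be extracted from the file and tested. This is a minimal sketch: `check_execstart` is a hypothetical helper, and it ignores the special prefixes (such as `-` or `@`) that systemd allows in front of the command.

```shell
# Hypothetical helper: verify that the binary named in ExecStart= exists
# and is executable. Argument: path to a .service file.
check_execstart() {
    unit_file=$1
    # Take the first ExecStart= line, drop the key, keep the first word (the binary).
    bin=$(sed -n 's/^ExecStart=//p' "$unit_file" | head -n 1 | awk '{print $1}')
    if [ -n "$bin" ] && [ -x "$bin" ]; then
        echo "ok: $bin"
    else
        echo "missing or not executable: $bin"
        return 1
    fi
}

# Usage (the unit file location is an assumed example):
#   check_execstart /etc/systemd/system/slurmd.service
```

After editing a unit file, `systemctl daemon-reload` makes systemd pick up the change.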
Remote Atom can easily be installed using the Atom package manager by going to "Settings > Install" and searching for remote-atom.
On the remote server, we need to install rmate (this one is the bash version). You don't have to install it if you already use rmate with TextMate or Sublime Text, since it is the same executable. Otherwise, the bash version can be installed by running the following commands (assuming you have the required permissions):
curl -o /usr/local/bin/rmate https://raw.githubusercontent.com/aurora/rmate/master/rmate
sudo chmod +x /usr/local/bin/rmate
You can also rename the command to atom
mv /usr/local/bin/rmate /usr/local/bin/atom
Open your Atom application, go to the menu Packages -> Remote Atom, and click Start Server. You can also launch the server via the command palette. The server can also be configured to launch at startup in the preferences.
Then, open an ssh connection to the remote server with the remote port forwarded. This can be done with
ssh -R 52698:localhost:52698 user@example.com
Once the server is running, you can open a file on the remote system with
atom test.txt
If everything has been set up correctly, you should see the file open in Atom.
It can be tedious to type -R 52698:localhost:52698 every time you ssh. To make your life easier, add the following to ~/.ssh/config:
Host example.com
RemoteForward 52698 localhost:52698
User user
A simple way to edit and develop code for the virtual cluster is to work in user space, mounting the slurm folder with FUSE.
- First install [Fuse sshfs](http://fuse.sourceforge.net/sshfs.html) locally
yum install sshfs
To edit files on the slurm-master node, mount its slurm-[version] folder
- Create a local mount point
mkdir /mnt/tmpMount
- Mount the slurm-master (192.168.1.15) folder on the mount point
sshfs [user]@192.168.1.15:/slurm-[version] /mnt/tmpMount
When you finish working, don't forget to unmount it
umount /mnt/tmpMount
In order to keep all machines in a cluster synchronized, one option is to put an NTP server on the master node and NTP clients on the other nodes. This works even if they do not have internet access, which can be convenient sometimes.
Configuration has been copied from http://www.thegeekstuff.com/2014/06/linux-ntp-server-client/
Master:
#1. Install NTP Server
yum install ntp
#2. Setup Restrict values in ntp.conf. Modify the /etc/ntp.conf file to make sure it has the following two restrict lines.
restrict default kod nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery
#4. Add Local Clock as Backup
server 127.127.1.0 # local clock
fudge 127.127.1.0 stratum 10
#6. Start the NTP Server
service ntpd start
Clients:
#7. Modify ntp.conf on NTP Client. Edit your ntp.conf to reflect appropriate entries for your own NTP server.
server 19.168.1.1 prefer
#8. Start the NTP Daemon
/etc/init.d/ntp start
#or
service ntpd start
Taken from:
https://linuxconfig.org/how-to-install-mpeg-4-aac-decoder-for-centos-7-linux
yum -y install http://li.nux.ro/download/nux/dextop/el7/x86_64/nux-dextop-release-0-5.el7.nux.noarch.rpm
yum install libdvdcss gstreamer{,1}-plugins-ugly gstreamer-plugins-bad-nonfree gstreamer1-plugins-bad-freeworld libde265 x265
yum install vlc smplayer