56 Commits

Author SHA1 Message Date
7e767c3716 memory / cpu resources fix 2021-06-25 12:19:00 +02:00
a61d08d118 restructured playbooks, cleanup 2021-06-25 01:55:14 +02:00
188a9215a9 tags #2 2021-06-24 16:37:46 +02:00
9499ce49ae fix: wrong network 2021-06-24 16:37:10 +02:00
9237d736d8 tags 2021-06-24 14:17:16 +02:00
e979ea4d6e fix hostname of cobald slurm node
made cobald able to run slurm jobs, which previously failed with
permission denied.
2021-06-24 14:07:35 +02:00
c7e931f29e fix: building base image -> update child images 2021-06-23 14:29:32 +02:00
a73f9ad6ad additional user in slurm base docker image 2021-06-23 14:28:50 +02:00
c35dc25c39 labels, some cleanup 2021-06-22 19:09:52 +02:00
1f4dfe1821 build cobald image from slurm role, separated tags 2021-06-22 16:48:56 +02:00
78850d4636 merged slurm_dockerimage back into slurm role 2021-06-22 00:26:00 +02:00
f83801cb62 removed cobald_facts module 2021-06-21 21:34:24 +02:00
e78e184375 WIP: cobald container containing and using slurm 2021-06-21 19:19:19 +02:00
02e87d7c40 cleanup, requisite files instead of startupscripts 2021-06-18 12:03:14 +02:00
4450c9bb65 WIP: separate slurm base and docker images 2021-06-17 22:50:40 +02:00
6eb6984d6a new startup for cobald containers 2021-06-17 14:55:34 +02:00
cc43a39ea3 dashboard revision 2021-06-14 10:43:47 +02:00
962d9b5ac9 grafana dashboard updated, wait_for 2021-06-10 10:51:01 +02:00
e81fb5d445 cobald container termination signal 2021-06-09 16:26:14 +02:00
73945b6cb9 shorter hostname for cobald container 2021-06-08 16:07:54 +02:00
089ea914b6 updated dashboards 2021-06-08 12:31:46 +02:00
dd1baa4aef grafana 2021-06-08 12:31:13 +02:00
ea3195a93c minor fixes (entrypoint) and restructuring 2021-06-08 12:28:09 +02:00
aef1499e65 fixed influxdb when container absent and wait_for 2021-06-02 17:04:32 +02:00
c7203f58ff fix: influxdb connection issue 2021-06-01 19:17:04 +02:00
2e0d83cca1 host ed-c7-2, fixed htop install 2021-06-01 18:30:18 +02:00
35882ca1a9 TODO for influx modules 2021-05-25 23:56:19 +02:00
4e7f33338e telegraf + influxdb 2021-05-25 23:47:03 +02:00
ddc6c2bb4d influx modules: fixes, permission match, py2, args 2021-05-25 19:13:57 +02:00
f9e29a4e30 influx bucket module 2021-05-25 15:00:47 +02:00
c26e962898 token module improved 2021-05-25 11:55:30 +02:00
38c117d6fa influxdb2 plugins 2021-05-21 20:28:50 +02:00
ecb9724ee3 generic unpriv_user 2021-05-11 23:56:29 +02:00
4373e0a4a2 cobald development environment 2021-05-11 23:48:49 +02:00
19b71c9933 first cobald tardis 2021-05-10 12:20:27 +02:00
fdd4bd6bf0 copy plugin just for 2.9 2021-04-30 17:47:54 +02:00
f7dd3bcf02 run slurmctld as user
Note: also trying to run slurmd on the execute nodes as a user makes no
sense because it breaks sbatch. Furthermore, something else is necessary
to run MPI jobs (just tried MpiDefault=none). I don't consider running
slurmd as root a good idea, but there seems to be no other choice at the
moment.
2021-04-30 17:15:57 +02:00
f2cb9b2c6b fixed log (includes log output from tasks now) 2021-04-30 16:47:31 +02:00
38a5b89de9 minor fixes 2021-04-29 12:19:33 +02:00
bc27dec00e shared volume 2021-04-27 22:36:06 +02:00
aa0fe4d039 variable username 2021-04-27 15:00:12 +02:00
d25c2f7a15 README, minor structure improvements 2021-04-27 14:32:25 +02:00
89423edf25 fully variable execute nodes 2021-04-27 14:18:14 +02:00
077988e03d privileged container switch 2021-04-26 18:00:40 +02:00
00c2a2a817 config 2021-04-26 17:21:35 +02:00
4586fa7092 slurm startup 2021-04-26 17:20:54 +02:00
53502213bc WIP: slurm 2021-04-24 00:13:57 +02:00
af71f7e983 removed minicondor configuration 2021-04-22 21:48:32 +02:00
ddf10eb2fa converted docker_htcondor to role 2021-04-22 19:08:04 +02:00
2767014223 htcondor role configuration 2021-04-22 18:48:59 +02:00