prkz.de - Blog


Keep it simple

Jenkins: Using lockable resource across multiple stages

written on 20 May 2021
The Jenkins [Lockable Resources](https://plugins.jenkins.io/lockable-resources/) plugin is nice, but its documentation (as with many Jenkins components) is fairly scarce and often not self-explanatory. The slightly more detailed reference for the `lock` step is documented [here](https://www.jenkins.io/doc/pipeline/steps/lockable-resources/). Of course, you can use it in a scripted pipeline or as a step in a declarative pipeline:

```groovy
pipeline {
    stages {
        stage('my-stage') {
            steps {
                lock('my-resource') {
                    ...
                }
            }
        }
    }
}
```

However, what I wanted is to let my Jenkins job (a declarative pipeline) acquire a lock *once* when the whole job starts and release it when all stages are completed (success or fail). Luckily, there's [this StackOverflow thread](https://stackoverflow.com/questions/44098993/how-to-lock-multiple-stages-of-declarative-jenkins-pipeline), which shows that you can actually use `lock()` in the `options` section of a Jenkins pipeline:

```groovy
pipeline {
    options {
        lock('my-resource')
    }
    stages {
        stage('stage-1') { ... }
        stage('stage-2') { ... }
    }
}
```

What is documented nowhere is its exact behaviour: **the lock gets acquired *once* for the entire job and is *not* released between stages.**

Alternatively, you can use a "parent" stage and acquire the lock for it once:

```groovy
pipeline {
    stages {
        stage('my-parent-stage') {
            options {
                lock('my-resource')
            }
            stages {
                stage('stage-1') { ... }
                stage('stage-2') { ... }
            }
        }
    }
}
```

In this case, the behavior is immediately obvious, but you need two additional indentation levels...
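
For reference, here is a minimal, self-contained sketch of the `options`-level lock (agent, stage names and echo steps are placeholders, not from my original job):

```groovy
pipeline {
    agent any
    options {
        // Acquired once before the first stage, released after the last stage finishes
        lock('my-resource')
    }
    stages {
        stage('stage-1') {
            steps {
                echo 'holding my-resource...'
            }
        }
        stage('stage-2') {
            steps {
                echo 'still holding my-resource...'
            }
        }
    }
}
```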

Fixing Not Found (404) error for project scans while setting up Jenkins GitLab Branch Source Plugin

written on 16 April 2021
In my case, we had Jenkins access GitLab through a reverse proxy (nginx), and GitLab was running inside a Docker container. I could configure the Jenkins job itself with the GitLab Branch Source Plugin just fine; it even nicely auto-discovered the projects for the owner that I set. However, once it tried to do the "Scan GitLab Project Now", it would give me an error saying "Not found (404)" for the project.

For debugging, I ran `tcpdump -A port 80` in the GitLab container, showing an access path looking like this:

```sh
https://my-gitlab.com/projects/my-group/my-project/...
```

Clearly, this didn't match the GitLab API URL for projects, which expects an ":id": [https://docs.gitlab.com/ee/api/projects.html](https://docs.gitlab.com/ee/api/projects.html). It took a long while to figure out that the API component of the Branch Source Plugin actually URL-encodes the "project path" as the id, i.e. the request URL is actually:

```sh
https://my-gitlab.com/projects/my-group%2Fmy-project/...
```

**Turns out, you have to tell nginx to use the original URI as-is (and not automatically unwrap/decode the URI):**

```sh
Wrong:   proxy_pass http://127.0.0.1:8929/
Correct: proxy_pass http://127.0.0.1:8929/$request_uri
```

More info in this StackOverflow thread: [https://stackoverflow.com/questions/20496963/avoid-nginx-decoding-query-parameters-on-proxy-pass-equivalent-to-allowencodeds](https://stackoverflow.com/questions/20496963/avoid-nginx-decoding-query-parameters-on-proxy-pass-equivalent-to-allowencodeds)
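
To check the encoding behaviour outside of Jenkins, you can query the GitLab projects API directly with the URL-encoded project path (token, hostname and project path are placeholders):

```sh
# A 200 response with project JSON means the encoded path reaches GitLab intact;
# a 404 here points at the proxy decoding the %2F before GitLab sees it.
curl --header "PRIVATE-TOKEN: <your-access-token>" \
  "https://my-gitlab.com/api/v4/projects/my-group%2Fmy-project"
```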

Recovering access to a "lost" Apache Kafka Topic entry in Apache Zookeeper

written on 28 September 2020
I recently had the case of having the log segments of a (single-replicated) Kafka topic still on disk and readable, but I had lost all ZooKeeper data, so the topic was not visible to any Kafka client.

First, after bringing the Kafka broker up again, I had to manually change the `meta.properties` to include the newly generated `cluster.id`. Now, the topics are stored in ZooKeeper under the path `/brokers/topics`. I managed to re-create the node for the "lost" topic by first creating a dummy topic, then copying and adjusting the ZooKeeper node contents of that dummy topic. On `zkCli.sh`:

```
create /brokers/topics/MyTopic {"version":2,"partitions":{"0":[1],"1":[1],"2":[1],"3":[1]},"adding_replicas":{},"removing_replicas":{}}
```

You can adjust the `partitions` as necessary for the number of partitions in your original topic data; the `1` for each partition is the broker id of the assigned Kafka broker. Obviously, in setups where not every broker has all partitions replicated, you will have to adjust the replica assignment as needed. Now, the topic should simply be accessible in Kafka again, with all data:

```sh
$ kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic MyTopic
MyTopic:0:1011
MyTopic:1:1023
MyTopic:2:1099
MyTopic:3:1002
```
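
If you go the dummy-topic route, you can first dump the node Kafka created for the dummy topic and use its JSON as a template for the lost topic. Topic name and flags below are examples; older Kafka versions use `--zookeeper` instead of `--bootstrap-server`:

```sh
# Create a throwaway topic with the same partition count as the lost one
kafka-topics.sh --bootstrap-server localhost:9092 --create --topic DummyTopic --partitions 4 --replication-factor 1
```

Then, on `zkCli.sh`, inspect its node and adapt the JSON for the real topic name:

```
get /brokers/topics/DummyTopic
```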

Using zsh on Debian Windows Linux Subsystem as terminal in VS Code

written on 27 April 2019
You can install the *Windows Subsystem for Linux (WSL)* in the "Turn Windows features on or off" wizard. Afterwards, you can install Debian as a regular "app" from the Windows Store. I then customized my Debian subsystem to use `zsh`, plus some additional tweaks. But I wanted to use that shell directly in VS Code.

To do this: in VS Code, go to File > Preferences > Settings and click on the curly braces in the top right corner to open the raw `settings.json`. Then you can point the integrated terminal to the Debian shell by adding the following (substitute `<USER>` with your username):

```json
"terminal.integrated.shell.windows": "C:\\Users\\<USER>\\AppData\\Local\\Microsoft\\WindowsApps\\debian.exe"
```

Now simply open a new terminal and you can use zsh in VS Code!

Fixing TaskManager out-of-memory crashes when running OsmLib in Flink with high parallelism

written on 14 March 2019
My goal was to run [OsmLib](https://github.com/conveyal/osm-lib) inside a high-parallelism Flink job to process geospatial data. OsmLib was first configured with the OSM map data of Europe, then of Germany. The server I ran the job on provided 40 logical cores and 128 GB of memory. For both maps, the TaskManager crashed after a few seconds of trying to deploy the streaming job; the Europe map crashed at an even lower parallelism than the Germany map. The error message looked like this:

> *OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00007f0676d7c000, 262144, 0) failed; error='Cannot allocate memory' (errno=12)*

This led me to think that there actually was not sufficient memory, but `top` showed otherwise. Eventually I found the solution in [this](http://www.mapdb.org/blog/mmap_files_alloc_and_jvm_crash/) blog post from MapDB, the underlying database framework of OsmLib: the TaskManager process hit the limit of `vm.max_map_count`. Running a quick check with `sudo watch -n1 'cat /proc/$TM-PID/maps | wc -l'` (on the Docker host) while the job was deploying confirmed the problem: the default limit of 65535 mmaps was hit rather quickly. And since each OsmLib instance (i.e. each parallel instance of the OsmLib operator in Flink) maps the whole OSM database individually, a high parallelism multiplies the number of maps required.

To fix the problem, you can set a much higher value for `vm.max_map_count` in `sysctl.conf`. Note that if you're running Flink in Docker containers, this value still has to be set on the Docker host. In my case, I set `vm.max_map_count = 512000` and checked the number of used mmaps again: with a parallelism of 32 and the German OSM map, this count reached over 180k after (finally) successfully deploying the job!
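
A small sketch of the fix on the (Docker) host; the value is the one from my setup, so pick whatever your parallelism requires:

```sh
# Persist the higher limit and apply it without a reboot
echo 'vm.max_map_count = 512000' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p

# Or set it only for the currently running system
sudo sysctl -w vm.max_map_count=512000
```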

Exporting Beam metrics on Flink with JmxReporter and Prometheus JMX Exporter

written on 23 May 2018
Beam allows you to define metrics, which are then forwarded to the specific runner (e.g. Flink). The underlying execution engine then handles reporting of those metrics. In my case, I tried to report the metrics with the `GraphiteReporter` first. However, Beam includes dots in some parts of the metric name (e.g. the namespace), which makes the Graphite paths generated by Flink almost unusable. Instead, I opted for the `JmxReporter`, which handles these cases properly.

As an example, say we create a metric in a Beam DoFn:

```java
package org.example;

import org.apache.beam.sdk.metrics.Counter;
import org.apache.beam.sdk.metrics.Metrics;
import org.apache.beam.sdk.transforms.DoFn;

public class MyOperator extends DoFn<...> {
    private Counter myCounter = Metrics.counter(MyOperator.class, "myCounter");
    ...
    @ProcessElement
    public void processElement(ProcessContext ctx) {
        myCounter.inc();
        ...
    }
}
```

Now, let's configure the JmxReporter in `flink-conf.yaml`:

```yaml
metrics.reporters: jmx
metrics.reporter.jmx.class: org.apache.flink.metrics.jmx.JMXReporter
metrics.reporter.jmx.port: 8789
```

To be able to view the exported metrics with jconsole, we have to instruct the JVM to publish the jmxremote service, also in `flink-conf.yaml`:

```yaml
env.java.opts: -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9999 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false
```

Note that when you're running Flink in a Docker container and want to access it from outside the Docker host, you also have to specify the **Docker host** hostname and open port `8789` (configured above):

```
env.java.opts: ... -Djava.rmi.server.hostname=flink
```

We can now examine the exported metrics on the "MBeans" tab in jconsole when connecting to port `8789`. Here is what the domain of a bean containing our counter metric could look like:

```java
org.apache.flink.taskmanager.job.task.operator.__counter__XYZ/ParMultiDo(XYZ)__org.example.MyOperator__myCounter
```

I now wanted to collect these metrics in a Prometheus system. For this, I used the [Prometheus JMX Exporter](https://github.com/prometheus/jmx_exporter). This exporter runs as a javaagent and can be configured with regex patterns to match beans. Here are the rules that should match all Flink JobManager, TaskManager and Beam operator metrics:

```yaml
rules:
  # Pattern format: domain<beanpropertyName1=beanPropertyValue1, beanpropertyName2=beanPropertyValue2, ...><key1, key2, ...>attrName: value
  - pattern: org.apache.flink.taskmanager.job.task.operator.([^:]+)<job_id=(\w+), job_name=([^,]+), tm_id=(\w+), task_attempt_id=(\w+), task_attempt_num=(\d+), subtask_index=(\d+), task_id=(\w+), operator_name=([^,]+), task_name=([^,]+), operator_id=(\w+), host=(\w+)><>(Count|Value)
    name: flink_operator_metric
    labels:
      metric: $1
      ... (add whatever you need) ...
  - pattern: org.apache.flink.taskmanager.(\w+)<host=(\w+), tm_id=(\w+)><>(Count|Value)
    name: flink_taskmanager_metric
    labels:
      host: $2
      tm_id: $3
      metric: $1
  - pattern: org.apache.flink.jobmanager.(\w+)<host=(\w+)><>(Count|Value)
    name: flink_jobmanager_metric
    labels:
      host: $2
      metric: $1
```

To run the JMX exporter javaagent with the Flink JVM, add this option to your `env.java.opts` in `flink-conf.yaml`:

```java
-javaagent:/path/to/jmx_prometheus_javaagent.jar=8097:/path/to/jmx_config.yaml
```

All that's left is to add a Prometheus scrape target:

```yaml
scrape_configs:
  - job_name: flink
    static_configs:
      - targets: ['flink:8097']
```

You can now query the operator metrics, e.g. in Grafana:

```java
flink_operator_metric{metric=~".*MyOperator__myCounter"}
```
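
To quickly verify that the exporter is actually serving metrics, a check like this should show the counter (hostname and port are the ones configured in the examples above):

```sh
curl -s http://flink:8097/metrics | grep flink_operator_metric
```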

Statically setting Flink JobManager port inside a YARN cluster

written on 12 April 2018
Recently, I tried to start a long-running Flink YARN session (via the `yarn-session.sh` script) and automatically deploy a job to it. However, by default, the YARN ApplicationMaster (which is the same process as the JobManager) will be listening on a random port. This is not useful for automatic job deployment. It is also not useful in scenarios where you are running your YARN cluster behind a firewall and don't want to open up a large range of ports. The same holds for Docker environments that require you to specify published ports.

To fix this, you can make the ApplicationMaster listen on a specific port or port range by setting the `yarn.application-master.port` property when starting your YARN session:

```
bin/yarn-session.sh -Dyarn.application-master.port=9123 -n 2 -jm 1024 -tm 4096
```

Now the `jobmanager.rpc.port` will be reported as 9123 in the Flink dashboard and you can access the JobManager at this port.

### Automatically submitting a Flink job to the running cluster

The problem with the solution above is that in a distributed environment, we still don't know on which of the nodes the JobManager is running. However, we don't need this information, as Flink stores the necessary YARN connection information in a temporary file on the host that you ran `yarn-session.sh` on. By default this file is located in the `/tmp` directory, but you can change this path with the `yarn.properties-file.location` Flink property. In my case, I had the session running in a separate Docker container, so I can submit the application through this container (the container name is a placeholder):

```sh
docker exec -ti <flink-session-container> /flink/bin/flink run /flink/examples/batch/WordCount.jar
```

Alternatively, if you want to run the job client in a separate container as well, you could create a shared volume between those containers and set the `yarn.properties-file.location` property described above.
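
If you prefer configuration files over command-line flags, both properties could also be pinned in `flink-conf.yaml`; the shared path below is just an example for such a shared volume:

```yaml
yarn.application-master.port: 9123
# Where the YARN session writes its properties file (point this at a shared volume)
yarn.properties-file.location: /shared/flink-yarn
```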

Connecting from windows to debian server with a remote desktop using x2go

written on 4 March 2018
1. Install openbox or any other desktop environment
2. Install the x2go server on the remote machine (change distribution "stretch" as needed):

    ```
    $ sudo apt-key adv --recv-keys --keyserver keys.gnupg.net E1F958385BFE2B6E
    $ echo 'deb http://packages.x2go.org/debian stretch main' | sudo tee /etc/apt/sources.list.d/x2go.list
    $ sudo apt update && sudo apt install x2goserver x2goserver-xsession
    ```

3. There is a Windows client for x2go ([https://code.x2go.org/releases/](https://code.x2go.org/releases/))

Making VSCode work on a debian machine connected to via x2go

written on 4 March 2018
1. Install VS Code (get the deb package, install it and run `apt install -f`)
2. Install possibly missing dependencies on a headless server:

    ```bash
    apt install libgtk2.0-0 libxss-dev libgconf-2-4 libasound2
    ```

3. Fix VS Code not starting up in x2go by adding this line to `/etc/x2go/x2goagent.options`:

    ```bash
    X2GO_NXAGENT_DEFAULT_OPTIONS+=" -extension BIG-REQUESTS"
    ```

4. Restart the x2go server (see the command sketch below) and reconnect from the x2go client
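
For step 4, on a systemd-based Debian the restart might look like this (the service name may differ depending on your installation):

```bash
sudo systemctl restart x2goserver
```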

Moving debian linux swap partition to the start of the disk

written on 22 November 2017
When I installed my virtual machine with Debian, the setup created a swap partition (/dev/sda5) right after the actual partition (/dev/sda1). However, this makes resizing the main partition difficult, so I wanted to move the swap partition to the start of the disk:

1. Using GParted Live:
    1. Remove the extended partition
    2. Move /dev/sda1 to the right
    3. Recreate the extended partition and swap in the empty space at the start of the disk
2. During the first boot, it will try to find the swap disk with a timeout of 1 min 30 sec
3. Recreate the swap fs:

    ```
    mkswap /dev/sda5
    swapon /dev/sda5
    ```

4. Determine the UUID of sda5 by looking at `ls -al /dev/disk/by-uuid`
5. Update the swap UUIDs in `/etc/fstab` and `/etc/initramfs-tools/conf.d/resume` (see the example below)
6. Run `update-initramfs -u`
7. Reboot

(Tested with Debian Stretch)
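
For steps 4 and 5, the entries you end up editing might look roughly like this; the UUID is a placeholder, use the one shown under `/dev/disk/by-uuid`:

```
# /etc/fstab
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx none swap sw 0 0

# /etc/initramfs-tools/conf.d/resume
RESUME=UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
```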

Make docker-machine docker host use a static ip address

written on 19 October 2017
Boot2Docker runs the `/var/lib/boot2docker/bootsync.sh` script when booting. You can use this file to set up a static IP address and default gateway during boot:

```bash
$ cat <<EOF | docker-machine ssh docker-host sudo tee /var/lib/boot2docker/bootsync.sh > /dev/null
ifconfig eth0 10.0.0.10 netmask 255.255.255.0 broadcast 10.0.0.255 up
ip route add default via 10.0.0.1
EOF
```

After modifying this file, restart the host, verify its settings and regenerate the client certificates:

```bash
$ docker-machine restart docker-host
$ docker-machine env docker-host
$ docker-machine regenerate-certs docker-host
```
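
To quickly confirm the machine kept the address after the restart (assuming the machine name `docker-host` from above):

```bash
docker-machine ip docker-host
```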

Automatically loading PowerCLI module when loading PowerShell ISE

written on 15 October 2017
PowerShell uses several "profile" scripts that are comparable to a `.bashrc`. You can determine the currently loaded profile by examining the `$profile` variable:

```powershell
> $profile
C:\Users\user\Documents\WindowsPowerShell\Microsoft.PowerShellISE_profile.ps1
```

You will notice that this ISE profile is different from the one used by PowerCLI. However, you can create this file and use it to import the VMware PowerCLI module when ISE starts:

```powershell
# Load PowerCLI commandlets for managing vSphere
Import-Module VMware.VimAutomation.Core -ErrorAction SilentlyContinue
```

After restarting PowerShell ISE, you should now be able to use commandlets like `Connect-VIServer` or `Get-VM`.
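
If the profile file does not exist yet, you can create it from within ISE before editing it; a small sketch using built-in cmdlets:

```powershell
# Create the profile file (and any missing parent folders) if it is not there yet
if (!(Test-Path $profile)) {
    New-Item -ItemType File -Path $profile -Force
}
```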

[MySQL] Swapping ids of two Doctrine ORM entities with foreign key constraints

written on 9 September 2017
I recently had to swap the ids of two Doctrine ORM entities (pages) that were referenced by other entities (articles). Usually you don't want to do this at all; even copying the entity contents over would be cleaner. However, I had to achieve the result with MySQL only.

You can first disable the foreign key checks (and re-enable them afterwards!). Then, swap the ids of the entities using a temporary id that is >= AUTO_INCREMENT to avoid a UNIQUE constraint error. Afterwards, fix the references with a single UPDATE statement using a CASE expression. The following code swaps the ids of the two pages #12 and #13 and updates the references in articles:

```sql
SET foreign_key_checks = 0;

-- Swap ids, where 999999 is >= current AUTO_INCREMENT
UPDATE pages SET id=999999 WHERE id=12;
UPDATE pages SET id=12 WHERE id=13;
UPDATE pages SET id=13 WHERE id=999999;

-- Update references in articles with a single UPDATE
UPDATE articles SET page_id = CASE page_id WHEN 12 THEN 13 WHEN 13 THEN 12 END WHERE page_id IN (12,13);

SET foreign_key_checks = 1;
```
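
To pick a safe temporary id, you can check the table's current `AUTO_INCREMENT` value first (table name as in the example above):

```sql
-- The AUTO_INCREMENT column in the output tells you which ids are guaranteed to be unused
SHOW TABLE STATUS LIKE 'pages';
```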

Fixing Windows not booting (blinking cursor) without second drive

written on 3 June 2017
I had a simple Windows setup where Windows was installed on an SSD, plus an extra data HDD. I recently took the HDD out of this system and it didn't boot anymore - just a blinking cursor. Startup Repair from the Windows setup couldn't fix the problem. I then went ahead and fixed it manually by finding and removing a property called `resumeobject` in the BCD (Windows Boot Configuration Data) entry. I suspected this object to refer to a non-existent file. To delete this `resumeobject` entry:

1. Boot from a Windows install medium and go into the command line
2. You can now list your entries in the BCD:

    ```powershell
    bcdedit /enum
    ```

3. To delete the `resumeobject` property:

    ```powershell
    bcdedit /deletevalue {default} resumeobject
    ```

    Here, `{default}` refers to the respective entry in the list you got above.

Note that the boot manager files may have also been installed on the second drive for some reason. You can recreate them on the main drive (the one with the Windows files on it), say the `C:` drive:

```powershell
bcdboot C:\Windows /s C: /f ALL
```

Here, `/f ALL` specifies that all firmware types are supported. Check out `bcdboot /?` for all possible options.

Linearizing depth buffer samples in HLSL

written on 31 March 2017
In Direct3D, we can create a shader resource view (SRV) for our hardware depth buffer to sample it in the shader. Usually you will use something like `DXGI_FORMAT_D32_FLOAT` for the depth stencil view and `DXGI_FORMAT_R32_FLOAT` for the SRV. However, this means that we get non-linear depth when sampling in the shader:

```
// The projection done in the vertex shader and the perspective divide
float4 clipCoords = mul(mtxProj, mul(mtxView, worldCoords));
clipCoords /= clipCoords.w;

// [-1,1] -> [0,1]
float2 normClipCoords = clipCoords.xy * 0.5f + 0.5f;

// Sample the raw depth buffer
float nonLinearDepth = depthBuffer.Sample(pointSampler, normClipCoords).r;
```

The sampled depth is in the range `[0,1]`, where 0 represents the near and 1 the far clip plane. Additionally, the distribution is not linear: changes in depth close to the near clip plane have a higher resolution than changes far away from the camera.

**To linearize the sampled depth-buffer value**, we can multiply the normalized device coordinates (NDC) vector by the inverse projection matrix and divide the result by the w coordinate (as the result is a homogeneous vector):

```
// We are only interested in the depth here
float4 ndcCoords = float4(0, 0, nonLinearDepth, 1.0f);

// Unproject the vector into a (homogeneous) view-space vector
float4 viewCoords = mul(mtxProjInv, ndcCoords);

// Divide by w, which results in the actual view-space z value
float linearDepth = viewCoords.z / viewCoords.w;
```

Note that depending on your projection matrix, you may have to negate the resulting depth, as the view-space z axis may point towards the camera.
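
As a possible shortcut (not part of the approach above): if the near and far plane distances of the projection are available in the shader, the same value can be recovered without the inverse matrix, assuming a standard D3D-style projection with depth mapped to [0,1]:

```
// nearPlane and farPlane are assumed to be provided, e.g. via a constant buffer
float linearDepth = (nearPlane * farPlane) / (farPlane - nonLinearDepth * (farPlane - nearPlane));
```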

Empty WSDL output of a WCF Service and response 400 Bad Request

written on 18 January 2017
I was running into this problem with a .NET WCF service (hosted with Windows Service Host) that had a custom binding endpoint and a host baseAddress configured in the app.config. The response to a GET request on `http://localhost:9001/myService/?wsdl` was a `400 - Bad Request`. In the Service Trace it said:

> *"There is a problem with the XML that was received from the network."*

and

> *"The body of the message cannot be read because it is empty"*

The reason was that I forgot a trailing slash at the end of the base address:

```xml
<endpoint contract="..." binding="..." bindingConfiguration="..." />
<host>
  <baseAddresses>
    <add baseAddress="http://localhost:9001/myService" />  <!-- Wrong -->
    <add baseAddress="http://localhost:9001/myService/" /> <!-- Correct -->
  </baseAddresses>
</host>
```


Delete dangling IPv6 address from VMnet8 adapter on windows

written on 28 December 2016
After changing the IPv6 prefix in the virtual network manager in VMware Workstation, I ended up with a dangling IP address and wondered how I could get rid of it. When trying `netsh delete address "VMware Network Adapter VMnet8" fd20:XXXXXX`, it told me that the command was not found.

**It turns out you have to explicitly tell netsh that you're talking about an IPv6 address:**

```powershell
netsh int ipv6 delete address "VMware Network Adapter VMnet8" fd20:XXXXXXXX
```

Remember to run this in a cmd as Administrator, otherwise you get a confusing error saying *"The system cannot find the file specified"*.

[Linux Teamspeak3 server] Using the serverquery to disable weblist or recover administrative access

written on 5 November 2016
The TeamSpeak server query over telnet uses an unencrypted channel to transmit the password. However, you can set a temporary serveradmin password to (e.g.) disable the weblist for your virtual server:

1. Stop the server and, inside your TeamSpeak server directory, run the server temporarily with a fixed serveradmin password:

    ```bash
    ./ts3server_minimal_runscript.sh serveradmin_password=PASSWORD
    ```

2. Connect via telnet to your server on port 10011 (default) and log in with the temporary password:

    ```bash
    login serveradmin PASSWORD
    ```

3. To disable the weblist on virtual server 1:

    ```bash
    use sid=1
    serveredit virtualserver_weblist_enabled=0
    ```

4. To generate a new access token:
    1. Connect via the TeamSpeak client and find out the number in parentheses behind the group name in `Permissions > Servergroups`
    2. Create a new token via the server query:

        ```bash
        tokenadd tokentype=0 tokenid1=<Group ID> tokenid2=0
        ```

5. Close the telnet session, stop `ts3server_minimal_runscript.sh` and run it again with a **random** PASSWORD
6. Stop the server and start the production server again
7. Clear out shell history entries containing the new random password

Starting with C# WCF SOAP services and EntityFramework: Lessons learned

written on 5 November 2016
I just started to develop a self-hosted WCF SOAP service using EntityFramework for serialization (with an MSSQL database backend). This service will later be accessed by a PHP SoapClient. Let me summarize some experiences here:

* You can create C# WCF contracts from a WSDL by using: `svcutil your.wsdl`
* Services are instanced for each request by default. You can change this by using the following attribute on your service implementation:
    * `[ServiceBehavior(InstanceContextMode = InstanceContextMode.Single)]`
    * This allows you to pass parameters to your service (see the sketch after this list)
* To allow non-Administrator users to run the service, execute this on the cmd as an administrator (you can use `+` as your address to represent localhost):

    ```powershell
    netsh http add urlacl http://your-address:PORT/ user=USERNAME
    ```

* WCF requires a getter AND setter for all properties (you can make them protected, though):
    * Otherwise the service host answers the SOAP request with a RST packet, resulting in an error saying: `[HTTP] Error Fetching http headers`
* To trace WCF events: [see MSDN](https://msdn.microsoft.com/en-us/library/ms733025(v=vs.110).aspx)
* EntityFramework (EF) can be installed via the NuGet package manager and will be installed automatically when running your project on a different machine
* Unfortunately, EntityFramework does not yet support an in-memory database for testing in the current stable version (EF6)
    * However, it will be supported in EF7
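
A minimal sketch of the single-instance idea (contract, class, endpoint and constructor parameters are illustrative, not the actual service from this post):

```csharp
using System;
using System.ServiceModel;

[ServiceContract]
public interface IMyService
{
    [OperationContract]
    string Ping();
}

// Single instance mode: the host reuses this one object for all requests
[ServiceBehavior(InstanceContextMode = InstanceContextMode.Single)]
public class MyService : IMyService
{
    private readonly string connectionString;

    // Because we construct the instance ourselves, we can pass in
    // whatever parameters or dependencies the service needs.
    public MyService(string connectionString)
    {
        this.connectionString = connectionString;
    }

    public string Ping() => "pong";
}

class Program
{
    static void Main()
    {
        var service = new MyService("Server=localhost;Database=MyDb;Integrated Security=True;");
        using (var host = new ServiceHost(service, new Uri("http://localhost:9001/myService/")))
        {
            host.Open();
            Console.WriteLine("Service running, press Enter to stop.");
            Console.ReadLine();
        }
    }
}
```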