12 Nov 2015

My AGI journey


Over the next year (and hopefully years), I plan on working on AI, but more particularly what is known as AGI, Artificial General Intelligence.

Since I am the kind of person who enjoys planning (perhaps too much), I've been thinking about the whole process for a while. Here's a brief overview of how I plan to structure my days as well as my work.

  • Decide and plan on which projects I will work today
  • Video log
    • What will I be doing today?
    • What might be blocking me or will be difficult and how do I plan to tackle that?

  • Record my thoughts: Either through video logs or through written notes, which is the more likely option. This will allow me to search my thoughts as well as organize them as necessary. Furthermore, I hope I'll be able to optimize my thought flow through tools.

  • Video log
    • What have I done during the day?
    • What are the key takeaways of the day?
    • What didn't I do?

  • Take notes (of interesting sections and ideas)
  • Write down thoughts
  • Write down questions related to what is being read (for further exploration)
  • Explore right away a question I might have had

I plan on working on multiple mini projects in order to ensure diversity as well as to allow me to spot projects which have the potential to be interesting and rewarding. However, all projects are valuable in their own right, and it is important to reflect on each project at the end in order to extract things I'll want to repeat in the future as well as things I'll want to avoid.

  • Post mortem
    • What went wrong
    • What went right

  • Update the status of all tracked activities
  • Write a list of things that were done during the month regarding each activity that was done. The purpose is to review what was done as well as provide a way to determine how well a project may/may not be progressing. It is also a good time for me to look at each individual activity that was done during the month and evaluate if I want to keep doing it or not.
  • Write a post-mortem of the month, describing the good/bad of my current process and progress, with a section suggesting improvements to try for the next few months.
  • Plan the projects/activities I will be working on next month, as well as their time allocation.

I will be experimenting with this process in the next few weeks and will iterate on it as I see fit. I will thus be updating this post as changes occur and as I think of better ways to do my work.

In this article, we'll go over how to set up Jenkins on an Ubuntu machine to run PHP 7.1 jobs. The steps should be easy to adapt to any other OS and target environment.

  • Docker Slaves Plugin

  • In a Dockerfile
FROM ubuntu:xenial

COPY sources.list /etc/apt/sources.list
RUN useradd -m --uid=1001 jenkins
COPY known_hosts /home/jenkins/.ssh/known_hosts
RUN chown jenkins:jenkins -R /home/jenkins/.ssh

RUN apt-get update
RUN DEBIAN_FRONTEND="noninteractive" apt-get install -y git vim curl wget build-essential python-software-properties software-properties-common unzip

RUN add-apt-repository -y ppa:ondrej/php
RUN apt-get update
RUN DEBIAN_FRONTEND="noninteractive" apt-get install -y --force-yes php7.1 php7.1-xml php7.1-mbstring php7.1-zip php7.1-pdo-mysql php7.1-pdo-sqlite

RUN apt-get -y autoremove && apt-get clean && apt-get autoclean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
  • docker build -t your/image:1.0.0 -t your/image:latest .

  • your/image (will use your/image:latest)

This is particularly useful if you need to pull git repositories from a private repository.

As of December 2016, if you want to use an SSH key inside a docker container, you first have to start the SSH agent on the node that will run the container. Then, when you run the docker image, pass in SSH_AUTH_SOCK as a volume so that the socket is shared with the host.

sshagent(['IDENTIFIER']) {
    docker.image('your/jenkins-slave-image').inside('--volume $SSH_AUTH_SOCK:$SSH_AUTH_SOCK') {
        // Here your SSH_AUTH_SOCK is shared with the host machine, which may be a jenkins slave
        // All commands that would use SSH for authentication (such as git or composer when installing from private repositories) should work
    }
}

node('docker') {
    catchError {
        timestamps {
            wrap([$class: 'AnsiColorBuildWrapper']) {
                sshagent(['private-git']) {
                    docker.image('tomzx/jenkins-slave').inside('--volume $SSH_AUTH_SOCK:$SSH_AUTH_SOCK') {
                        stage 'Checkout'
                            checkout scm

                        stage 'Setup dependencies'
                            sh 'wget -nc https://getcomposer.org/composer.phar'
                            sh 'php composer.phar install'

                        stage 'Test'
                            sh 'vendor/bin/phpunit'
                    }
                }
            }
        }
    }
}

06 Nov 2015

Fitness trackers


I've recently become interested in the Quantified Self and wanted to compare a few trackers in order to determine the quality of their solution.

In this study, I compare the Fitbit Charge HR, the Jawbone UP3 and the RS300X. The Fitbit and the Jawbone are known as activity/fitness trackers while the RS300X is a heart rate tracker.

Fitbit and Jawbone are well-known fitness tracker brands. These types of trackers have been popular for a couple of years now, so I'd expect them to have reached some level of maturity. The goal of this study is to evaluate the quality of the products, in terms of both hardware and software.

In my case, I will be testing the Android applications since I own a Nexus 5. I would expect the experience to be quite similar on iOS.

I've also bought an RS300X in order to test a live heart rate tracker.

  • Has a little OLED display.
  • UI/UX is pretty straightforward.

  • The bracelet broke about 3 months after I purchased it. Fitbit support was excellent though and I received a replacement bracelet within 2 days.
  • Website errored out when I tried to create an application (preventing people from creating apps?)
  • Cannot get fine grained data.
  • Heart rate data availability is very odd. It seems to lag and may display only in chunks.
  • Shows up in my android smart lock as a bluetooth device I could pair with, but I can't select it.

  • None at the moment.

  • The bracelet broke about 4 months after I purchased it.
  • Cannot get fine grained data.
  • Doesn't do real time heart rate tracking.
  • Heart rate data availability is very odd. It seems to lag and may display only in chunks.
  • No idea why they decided to go with some weird buckle design. It looks like it's a nice point of failure.
  • After about a month it feels like the buckle is becoming loose. It is more and more frequent that it becomes undone and comes close to falling and I have to attach it back...
  • After about two months the buckle is becoming loose very frequently. I'd say I have to "re-buckle" it at least 10-20 times per day. That is ignoring the fact that it'll unbuckle while I sleep, making it pretty useless for tracking heart rate while sleeping.
  • The four heart rate sensors on the lower part of the bracelet are uncomfortable.
  • All the people I've shown the device to asked "why is there no display, show at least the time...".
  • Terribly clunky UI/UX. Why can't I just swipe from day to day in any of the stats?
  • For that matter, why is it so hard to display the metrics I'm interested in and get rid of all those "pretty" suggestions cards?
  • Why would you put unit selection (metric or imperial) as being configured through my height or weight? Is it to save me time or to confuse me?

The following chart is the data I've collected over 3 days. In black is the Jawbone UP3 data and in blue the Fitbit Charge HR data.

Overall, the Fitbit Charge HR data is more consistent (every 5 minutes) while the Jawbone UP3 may end up having no data point for up to 2 hours. I assume this might be caused by how I've been wearing the bracelet but that is a weak argument.
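One way to put a number on that consistency is to look at the largest gap between consecutive heart rate samples for each device. The following is a minimal Python sketch, assuming the samples have already been exported as a list of timestamps; the function name `largest_gap` is hypothetical and is not part of either vendor's tooling.

```python
from datetime import datetime, timedelta
from typing import List


def largest_gap(samples: List[datetime]) -> timedelta:
    """Return the longest interval between two consecutive samples."""
    ordered = sorted(samples)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    # With fewer than two samples there is no gap to measure.
    return max(gaps, default=timedelta(0))
```

A device reporting every 5 minutes would yield a largest gap of about 5 minutes, while a device that drops out for a long stretch would show up immediately.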

Accuracy between the two devices can vary a lot.

[Pedometer chart for comparison]

Both devices seem to measure approximately the same amount of sleep. However the Jawbone UP3 has a nicer chart that goes into the various stages of sleep while the Fitbit Charge HR only displays Asleep/Restless/Awake.

[Sleep monitoring chart for comparison]

  • None at the moment.

  • Cannot get data out of device (for free).
  • "If you want your data, pay us another $70 + taxes for a transmission device".
  • "Oh, and by the way, you're going to send us that precious data over our web service so we can mine it for $".
  • Watch freaks out if it is too close to the chest strap.
  • Who thought of this terrible buckle design? If I want to break the strap, that's how I'd design the buckle.

  • PHP
  • MySQL
  • Jenkins
  • Apache/Nginx
  • Linux (Ubuntu)
  • node.js/io.js

My goal with this post (and any subsequent posts) is to share my thoughts and current practices on the topic of developing PHP applications in a startup environment.

Starting a new startup means making decisions. Which framework to choose, what tool to use, which programming language, what task should be done before this other task, etc.

Starting is often overwhelming. What should be done first? If we ignore all the questions about the business (what sector? any specific niche? what sort of product?), then the first thing that an individual or a team should aim for is to prepare for iteration.

Many would start by working directly on their first project. It makes sense since it is the primary goal of your startup to produce results. However, writing code without establishing some sort of workflow framework will be inefficient.

My first step is generally to set up Jenkins, a continuous integration tool. It allows me to set up automated testing and automated deployment to a development/staging environment. This is useful for two purposes:

  1. Having an external "party" execute the tests in its own environment (separate from mine). This validates that whatever is in source control will work on someone else's computer.
  2. Automatically deploying "stable" versions (in the sense that they pass testing) to an online-facing server. With automated deployment, I can keep writing code and have it tested and then deployed to a server where I can ask others to take a look and provide feedback.

There are a couple of ways to get set up.

Everything will be set up on the same machine. Here is how it basically goes:

  1. Install jenkins
  2. Create two jenkins jobs: project-name-develop, which takes care of building the develop branch of your repository and running the tests (basic continuous integration), and project-name-develop-to-development, which again builds the develop branch of your repository, but this time for the purpose of making it available online.

There won't be much to discuss here except a list of plugins that are almost mandatory (either because they make jenkins much more useful or allow you to more quickly diagnose issues).

  • AnsiColor
  • Checkstyle Plug-in
  • Clover PHP plugin
  • Credentials Plugin
  • Duplicate Code Scanner Plug-in
  • GIT client plugin
  • GIT plugin
  • HTML Publisher plugin
  • JDepend Plugin
  • JUnit Plugin
  • Mailer Plugin
  • Matrix Authorization Strategy Plugin
  • Matrix Project Plugin
  • Node and Label parameter plugin
  • Parameterized Trigger plugin
  • Plot plugin
  • PMD Plug-in
  • Self-Organizing Swarm Plug-in Modules
  • Slack Notification Plugin
  • SSH Credentials Plugin
  • SSH Slaves plugin
  • Static Analysis Utilities
  • Throttle Concurrent Builds Plug-in
  • Timestamper
  • Violations plugin
  • xUnit plugin

I'll now go into more detail as to what each does.

  1. Pull the latest revision from the repository
  2. Download and update composer (if required)
  3. Install dependencies
    1. bower install
    2. npm install
    3. composer install
  4. Build assets to validate they compile
    1. Compile LESS into CSS
    2. Concatenate and minify JS
  5. Prepare the application environment
    1. Migrate database
    2. Seed database
  6. Run continuous integration tools to assert code quality
    1. phpunit
    2. phploc
    3. pdepend
    4. phpmd
    5. phpcs
    6. phpcpd
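The outline above amounts to running a sequence of commands and stopping at the first one that fails. Jenkins does this for you with shell build steps, but the idea can be sketched in a few lines of Python. The step list and the `run_steps` helper below are hypothetical and only illustrate the fail-fast behaviour; the real commands and options depend on your project.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

# Hypothetical subset of the steps listed above; adjust to your project.
STEPS: List[Sequence[str]] = [
    ["git", "pull"],
    ["composer", "install"],
    ["vendor/bin/phpunit"],
]


def run_steps(
    steps: Sequence[Sequence[str]],
    runner: Callable[[Sequence[str]], int] = subprocess.call,
) -> Optional[Sequence[str]]:
    """Run each step in order; return the first failing step, or None."""
    for step in steps:
        if runner(step) != 0:
            # Stop early so later steps never run against a broken build.
            return step
    return None
```

The `runner` parameter is there so the sequence can be exercised without actually invoking git or composer.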

An iterative cycle here should take less than 5 minutes (and a maximum of 30 minutes). The goal is to quickly know after pushing changes to your repository that nothing is broken.

For this to work, you simply need to make a symbolic link from the jenkins project workspace to some path which apache/nginx makes available to external users. For example

/home/jenkins/workspace/project-a-develop-to-development/public -> /var/www/development/project-a
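As a rough illustration, the link can be created (or refreshed) with a few lines of Python. This sketch assumes the link itself lives under the web root and points at the workspace's public directory, which is how I read the example above. The helper name `publish_workspace` is hypothetical.

```python
import os


def publish_workspace(workspace_public: str, web_path: str) -> None:
    """Create or refresh a symlink at web_path that points to workspace_public."""
    if os.path.islink(web_path):
        os.remove(web_path)  # drop a stale link before recreating it
    os.symlink(workspace_public, web_path)
```

With the paths from the example, this would be called as `publish_workspace('/home/jenkins/workspace/project-a-develop-to-development/public', '/var/www/development/project-a')`, run by a user allowed to write to the web root.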

  1. Pull the latest revision from the repository
  2. Download and update composer (if required)
  3. Install dependencies
    1. bower install
    2. npm install
    3. composer install
  4. Build/Prepare website
    1. Compile LESS into CSS
    2. Concatenate and minify JS
  5. Prepare the application environment
    1. Migrate database
    2. Seed database

An iterative cycle here should take less than 5 minutes. Anything that takes longer than that would be suspicious.

Now that you have both projects set up, here's how things work. First, project-name-develop is triggered every 1-5 minutes and checks the repository for changes. If changes are detected, a build starts and will verify that the current state of the code is valid.

Once the build finishes, if it is successful, project-name-develop-to-development will start (triggered on project-name-develop success). It will deploy the stable code so that users may test it.

A whole change cycle will generally take from 1 to 30 minutes depending on how many tests you have and how well you've been able to optimize your jenkins build workflow.

Here's a list of things to try/check:

  • If you are running phpunit with code coverage, disable it and run it in a separate jenkins project. Running with code coverage is 2-5x slower than running without it. When you are running the tests, you want to know the results fast and code coverage should not be a priority. Speed is the priority.
  • If you are running tests against a database and the tests require setting up and tearing down the database (either just truncating the tables or a full DROP of the tables), search for ways to avoid hitting the database or to improve performance. For example, if you are testing using SQLite, run an initial database migration and seeding once and keep the resulting .sqlite file so that it can be copied on test setup instead of migrating/seeding every time.
  • If migrating/seeding takes a long time, keep the resulting .sqlite file and only rebuild it if its source files (dependencies) have changed. On a project, you will run tests much more often than you will be rebuilding the .sqlite file, so it is worth investing in developing such a tool.
  • Since php is single threaded, look for tools that will enable you to do multi-process php testing. An example of such a tool is liuggio/fastest. Depending on the number of processors/cores you have available, you could see a 4-8x gain in speed.
  • If you have the money/hardware, distribute testing over many machines. If you want a unified phpunit code coverage/results, you can use phpcov to merge separate test results into a single result file.
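The "rebuild the .sqlite file only when its sources change" idea from the list above can be sketched as follows. This is a minimal Python illustration, not an existing tool: the function names, the `.fingerprint` side file and the `build` callback (which stands in for "run migrations and seeding") are all assumptions.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Callable, Iterable


def fingerprint(sources: Iterable[Path]) -> str:
    """Hash the names and contents of the files the database is built from."""
    digest = hashlib.sha256()
    for path in sorted(sources):
        digest.update(path.name.encode())
        digest.update(path.read_bytes())
    return digest.hexdigest()


def ensure_template(
    sources: Iterable[Path], template: Path, build: Callable[[Path], None]
) -> bool:
    """Rebuild the template only if its sources changed. Returns True if rebuilt."""
    stamp = template.with_name(template.name + ".fingerprint")
    current = fingerprint(sources)
    if template.exists() and stamp.exists() and stamp.read_text() == current:
        return False
    build(template)  # hypothetical callback: migrate + seed into `template`
    stamp.write_text(current)
    return True


def fresh_copy(template: Path, target: Path) -> None:
    """Give a test its own database by copying the pre-built template."""
    shutil.copyfile(template, target)
```

Test setup would then call `fresh_copy` for every test, which is a plain file copy, while the expensive migration and seeding only run when a migration or seed file actually changes.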

The following depicts how I “solved” a problem I recently had regarding munin, its mysql plugins and the shared memory cache library used by the plugins (written in perl and using IPC::ShareLite).

First off, let’s begin with a description of the problem. I posted the following on serverfault.com in hope I’d get help from someone more experienced than I am.

I’ve recently set up a munin-node on a CentOS server. All was working fine until I tried to add the apache plugin (which works fine).

For some odd reason, the mysql plugins for munin that used to work ceased to work… I’m now getting a weird error whenever I’m running the plugin with munin-run. For instance

munin-run mysql_files_tables

returns me

IPC::ShareLite store() error: Identifier removed at /usr/lib/perl5/vendor_perl/5.8.8/Cache/SharedMemoryBackend.pm line 156

but sometimes it will also return

table_open_cache.value 64
Open_files.value 58
Open_tables.value 64
Opened_tables.value 19341

but after a while it will revert to the previous error.

I do not have any knowledge about the IPC or the ShareLite library so I don’t really know where to start looking. Since it is a module related to shared memory, I tried tracking down shared memory segments with ipcs without much success.

I haven’t yet rebooted the machine as it is used for many projects (I’d obviously like to be able to diagnose the problem without requiring a restart if it was possible).

Has anyone faced this problem? (a quick search on google didn’t present any relevant help)

Thanks for the help!

Obviously, one can see quickly that this is a quite specific question that not many may have actually encountered. Thus, I didn’t expect to receive much help out of it (and I didn’t).

I had left this issue on the side for a couple of days hoping to come back to it at some point. Munin and the mysql plugins were installed on two servers and it was working fine on both of them (and a third one as master node). After a minor change, one of two client nodes stopped working correctly while the other was still fine. After a couple of days though the second server also decided to exhibit a similar issue…

Tonight I remembered about strace, which is pretty awesome in circumstances like this one. I went ahead and launched strace munin-run mysql_files_tables which outputted a lot of stuff and then stopped at the following point:

ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff13da8e30) = -1 ENOTTY (Inappropriate ioctl for device)
lseek(4, 0, SEEK_CUR)                   = 0
read(4, "# Carp::Heavy uses some variable"..., 4096) = 4096
brk(0x163e7000)                         = 0x163e7000
read(4, "\n    redo if $Internal{$caller};"..., 4096) = 1737
read(4, "", 4096)                       = 0
close(4)                                = 0
write(2, "IPC::ShareLite store() error: Id"..., 123IPC::ShareLite store() error: Identifier removed at /usr/lib/perl5/vendor_perl/5.8.8/Cache/SharedMemoryBackend.pm line 156
) = 123
semop(14581770, 0x2ab08bb67cf0, 3

and when it is actually fixed, the application would end instead (outputting a bunch of stuff such as the following)

stat("/usr/lib64/perl5/auto/Storable/_freeze.al", {st_mode=S_IFREG|0644, st_size=706, ...}) = 0
stat("/usr/lib64/perl5/auto/Storable/_freeze.al", {st_mode=S_IFREG|0644, st_size=706, ...}) = 0
open("/usr/lib64/perl5/auto/Storable/_freeze.al", O_RDONLY) = 4
ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fffe7223570) = -1 ENOTTY (Inappropriate ioctl for device)
lseek(4, 0, SEEK_CUR)                   = 0
read(4, "# NOTE: Derived from ../../lib/S"..., 4096) = 706
read(4, "", 4096)                       = 0
close(4)                                = 0
semop(917514, {{1, 0, 0}, {2, 0, 0}, {2, 1, SEM_UNDO}}, 3) = 0
semop(917514, {{2, -1, SEM_UNDO|IPC_NOWAIT}}, 1) = 0
semop(917514, {{1, 0, 0}, {2, 0, 0}, {2, 1, SEM_UNDO}}, 3) = 0
shmdt(0x7fc30021f000)                   = 0
semop(917514, {{2, -1, SEM_UNDO|IPC_NOWAIT}}, 1) = 0

What you can see in the first output above is pretty interesting. The semop call gives you the semid the process is trying to obtain (the semaphore used to synchronize different processes using the same shared memory). The signature of the semop function is as follows:

int semop(int semid, struct sembuf *sops, unsigned nsops);

semid: semaphore id
sops: pointer to a sembuf struct

struct sembuf {
    u_short sem_num; /* semaphore # */
    short   sem_op;  /* semaphore operation */
    short   sem_flg; /* operation flags */
};

nsops: the length of sops
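To make the sembuf entries in the working trace easier to read, here is a small Python sketch that spells out what each (sem_num, sem_op, sem_flg) triple asks the kernel to do, following the semop semantics: a sem_op of zero waits for the semaphore to reach zero, a positive value is added to it, and a negative value is subtracted from it. The `describe` helper is hypothetical, and the flags are kept as the strings strace prints.

```python
from typing import List, Tuple

SemOp = Tuple[int, int, str]  # (sem_num, sem_op, flags as printed by strace)


def describe(op: SemOp) -> str:
    """Explain one sembuf entry in words."""
    sem_num, sem_op, flags = op
    if sem_op == 0:
        action = "wait until it is zero"
    elif sem_op > 0:
        action = f"add {sem_op}"
    else:
        action = f"subtract {-sem_op}"
    suffix = f" [{flags}]" if flags != "0" else ""
    return f"semaphore {sem_num}: {action}{suffix}"


# The two kinds of semop calls seen in the working trace.
first_call: List[SemOp] = [(1, 0, "0"), (2, 0, "0"), (2, 1, "SEM_UNDO")]
second_call: List[SemOp] = [(2, -1, "SEM_UNDO|IPC_NOWAIT")]
```

Read this way, the first call looks like taking a lock (wait for semaphores 1 and 2 to be zero, then increment semaphore 2) and the second like releasing it, though that is my interpretation of the trace rather than something the library documents here.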

Upon first inspection, you can see that the sembuf in the first case seems to be invalid if you compare it with the working version, where it is actually resolved (strace displays something such as {{2, -1, SEM_UNDO|IPC_NOWAIT}} instead of 0x2ab08bb67cf0). But that is not helping me much.

With that semid you can do two things: first, you can check if it is still alive by calling ipcs, second, you can remove it with ipcrm -s semid.
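Checking whether a semid is still alive can be scripted by parsing the output of `ipcs -s`. The following Python sketch assumes the usual Linux column layout (key, semid, owner, perms, nsems). The sample text is illustrative only: the owner and values are made up, and only the semid is taken from the working trace.

```python
from typing import Set

# Illustrative `ipcs -s` output; not captured from the actual server.
SAMPLE = """\
------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x00000000 917514     munin      600        3
"""


def semaphore_ids(ipcs_output: str) -> Set[int]:
    """Collect the semid column from `ipcs -s` style output."""
    ids = set()
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Data rows have a numeric second column; header rows do not.
        if len(fields) >= 2 and fields[1].isdigit():
            ids.add(int(fields[1]))
    return ids
```

In practice the text would come from something like `subprocess.run(["ipcs", "-s"], capture_output=True, text=True).stdout`, and a semid found in the result can then be removed with `ipcrm -s` followed by that id.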

In my case the “fix” itself was to remove the semaphore that the plugin wasn’t able to obtain (the reason for this still eludes me though). After the removal of the semaphore, munin runs correctly again and the "Identifier removed" error is gone.

I will have to do more research as to how/why this issue occurs as I’ve seen it happen only on CentOS machines so far (the master server is a Debian machine).