03 Feb 2020

Repeating yourself

2 min read (~279 words)

If you offload everything in your head into documents, how long does it take before you start repeating yourself?

How soon you repeat yourself depends on how diverse your thoughts are. If you always think about the same problems in the same ways, you're likely to repeat yourself a lot. However, if you explore the same problems through different lenses, you'll have fewer chances to repeat yourself.

If you spend most of your time exploring new ideas, then you may only come back to ideas you had in the past from time to time.

If you work in a specialized field you may have to learn the basics multiple times in order to master them. Sometimes you will have acquired new knowledge that might challenge your existing assumptions. You will be mostly repeating yourself, but you will be making small adjustments at the same time.

When we offload everything that we have in our head into documents, our goal is to have more working memory space for our current work. We want to avoid having to come back to the same topics and having to start our exploration from scratch.

When we're exploring, we want to avoid exploring the same topics without noticing. As we explore a field, we may get a sense that the field is very large and that it may take a long time to get familiar with it. By mapping the field we can get a better sense of its size while at the same time discovering the important concepts. This lets us identify when we are making use of the same concepts over and over again.

02 Feb 2020

Spread of corruption

2 min read (~222 words)

Should we let employees from corrupted companies apply to other companies?

I am of the opinion that it is better for corruption not to spread.

I'd hope that when (potentially) corrupted individuals join other companies, the culture of those companies would prevent them from corrupting their new workplace. Either the corrupted individuals would have to stop being corruptors (in the future, which would be ideal), or they would be evicted from the "healthy" company so that this behavior is not fostered.

One issue with corrupted employees spreading to other companies is that it is difficult to identify them as they move between those companies. We could establish a blacklist of companies that exhibited bad behaviors and avoid employing people who worked there. This approach may, however, punish employees who are not corrupted. As larger and larger companies exhibit behaviors that might put them on this blacklist, the number of individuals ending up on it may become too large, shrinking the pool of candidates too much.

What companies, and individuals within healthy companies, need is a way to identify individuals who are likely to be corrupted or corruptible. The difficulty, however, is that those individuals are likely to be cunning, which means that no single technique will always succeed at identifying them.

01 Feb 2020

Thoughts tracking

2 min read (~366 words)

How do I track my thoughts?

When I am on the go, I mostly rely on Google Keep to write down what's on my mind. I use it because it loads fast, is straightforward, and allows me to quickly dump my thoughts and go back to what I was doing. Since I always have my phone with me, it's always within reach. It also synchronizes with Google when I'm online, so my notes are available on any other device. This also means that I may record things from my computer, my work computer, my phone or my tablet, depending on the device I'm using at the time (defaulting to my phone if I'm not using any device when the thought crops up).

When I am at home, I use a simple system I've devised in Visual Studio Code, built around two keyboard shortcuts: one that opens today's buffer (CTRL+Numpad 2) and one that inserts a datetime and note id on the current line (CTRL+Numpad 0).
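To give an idea of what the second shortcut produces, here is a minimal Python sketch of a "datetime and note id" prefix. The timestamp format and the id scheme (the date followed by a per-day counter) are assumptions made for illustration; the actual Visual Studio Code setup may format them differently.

```python
from datetime import datetime


def note_header(now: datetime, sequence: int) -> str:
    """Build a 'datetime + note id' prefix for a new note line.

    Hypothetical format: a second-resolution timestamp, followed by an id
    made of the date and a zero-padded per-day counter.
    """
    timestamp = now.strftime("%Y-%m-%d %H:%M:%S")
    note_id = f"{now.strftime('%Y%m%d')}-{sequence:03d}"
    return f"{timestamp} [{note_id}] "


if __name__ == "__main__":
    # Print the prefix that would be inserted for the first note of the day.
    print(note_header(datetime.now(), sequence=1))
```

Keeping the id derived from the date makes each note line easy to reference later without needing any central counter.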

When I am at work, I use this same system to take notes pertaining to my work. I generally try to organize my thoughts per project so that I can go back to any specific project and re-read the notes to get back in context. I also write down notes related to issues I'm working on and identify them using the issue ID given by JIRA.

In both cases (at home and at work), I have also configured Visual Studio Code to automatically commit changes made to markdown files to git whenever the editor focus changes. This gives me a somewhat granular log of the changes that happen to my note files. At work, I have configured a cronjob that automatically pushes the notes to my private git repository. This allows me to pull those notes at home and read them whenever I want. I also push my personal notes to my private git repository, but I do not pull them on my work computer because I haven't had the need for it.
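The auto-commit step could be sketched outside the editor as a small Python script. This is not the Visual Studio Code configuration itself, only an approximation of its effect; the function names and the commit message are hypothetical, and paths containing unusual characters are not handled carefully.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def changed_markdown_files(status_output: str) -> list[str]:
    """Extract markdown paths from `git status --porcelain` output."""
    paths = []
    for line in status_output.splitlines():
        # Each line is two status characters, a space, then the path.
        path = line[3:]
        # Renames are reported as "old -> new"; keep the new path.
        path = path.split(" -> ")[-1].strip().strip('"')
        if path.endswith(".md"):
            paths.append(path)
    return paths


def commit_notes(repo: Path, message: str = "Auto-commit notes") -> bool:
    """Stage and commit changed markdown files; return True if a commit was made."""
    status = subprocess.run(
        ["git", "status", "--porcelain", "--untracked-files=all"],
        cwd=repo, capture_output=True, text=True, check=True,
    ).stdout
    files = changed_markdown_files(status)
    if not files:
        return False
    subprocess.run(["git", "add", "--", *files], cwd=repo, check=True)
    # Note: this commits everything staged, including files staged beforehand.
    subprocess.run(["git", "commit", "-m", message], cwd=repo, check=True)
    return True
```

A scheduled job could then call `commit_notes` followed by `git push` to get the same effect as the cronjob described above.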

31 Jan 2020

Questions to think about a problem differently

2 min read (~380 words)

What simple questions can be used to think about a problem differently?

I've learned to solve problems by thinking about what I already knew. I would try to think of parts of the solution that might make sense and I would piece those parts together to get from the problem to the solution.

While I was doing some hobbyist research on AGI, I read George Pólya's classic "How to Solve It". Pólya covers many of the questions I would ask myself implicitly, so I was glad to see that someone had taken the time to write them in a reusable format.

His four steps of 1. understanding the problem, 2. devising a plan, 3. carrying out the plan and 4. looking back will lead you to ask yourself what you know and don't know about the problem you are trying to solve. Doing so is similar to Feynman's technique, where you attempt to teach what you know to someone else and, by doing so, discover the parts of your explanation that need improvement.

What I liked the most about Pólya's approach was that he was comfortable working with partial solutions. If you couldn't figure out how to get to the end, it was still important to put forward all the tools you had at your disposal in order to attempt to solve the problem. This way you would be able to get an idea of what was lacking in your solution.

  • What is not known yet?
  • What data do you have?
  • What conditions are there?
  • Is it possible to satisfy all those conditions?

  • Have you seen this problem before?
  • Have you seen a similar problem in a slightly different form?
  • Do you know any related problems?
  • Do you know something that could be useful to solve this problem?
  • If you cannot solve this problem yet, can you solve a related problem?

  • Can you see clearly the steps from beginning to end?
  • Can you prove that your approach is correct?

  • Can you check your solution?
  • Can you get to your solution differently?
  • Can you use your solution to solve other problems?

30 Jan 2020

Tools I use daily and can contribute to

5 min read (~952 words)

What are the tools I use daily that I could contribute to?

  • Visual Studio Code: My text editor of choice when I don't need an IDE. I've implemented a few plugins for VSC and use it daily to write personal notes as well as blog articles. I sometimes make use of its diff tool to merge changes from Sourcetree.
  • PyCharm: My IDE of choice when I write Python. Very powerful, easy to get used to if you've used other JetBrains IDEs (I've used PHPStorm for more than 5 years). Extremely useful to run a specific unit test using pytest or to debug a complex issue by putting breakpoints and investigating the internal state of the program.
  • Docker: I use Docker at work to containerize all of our dependencies so that it is somewhat easy to deploy what we develop in "any" environment, that is, an environment where Docker (or similar, such as Kubernetes) is installed.
  • Drone CI: We use Drone CI at work to do continuous integration and I consider this tool to be an essential part of my daily work. When people push code to GitHub and it fails on Drone CI, I can use this information to help them fix their issues. We also use it as part of our PR process to ensure that the PR passes all the expected tests so that we do not introduce faulty code into our master branch.
  • Dependabot: A few months ago I introduced dependabot into my dependency management practices. It was highly useful to get automated PRs with updates to libraries we depended on. However, since the release of poetry 1.0.0, dependabot has not been able to update my Python dependencies and has been left unused. I've created a PR which I hope will move this issue forward and get dependabot working again with poetry.
  • Plotly: I used to use the highcharts plotting library until someone at work introduced me to plotly. Plotly.js is open source software released under the MIT license, which makes it an ideal library to use in personal as well as commercial software.

  • Pandas: I do machine learning development for a living nowadays and I depend highly on pandas. I don't think there's a single work day that goes by in which I don't whip out at least one pd.DataFrame.
  • Scikit-learn: Like pandas, Scikit-learn is something I depend on daily. Contrary to what the DummyRegressor documentation suggests, I use it for real problems and it's definitely useful!
  • Dask / Distributed: In order to scale machine learning problems both horizontally and vertically, I've leaned on dask and distributed. Their use of delayed and Futures has made it simple to migrate simple for-loop code into highly distributed tasks which can be monitored through a bokeh dashboard.
  • pytest: Who writes code without testing it? Pytest is the PHPUnit of Python for me, an essential component that is used daily to ensure that code doesn't regress more than it needs to.
  • mypy: Python's typing system is pretty weak in my opinion. I miss using PHP's typing system, as well as its visibility system. Mypy is similar to doing a code compilation pass and verifying that the types specified in a function signature are the types of the arguments given to that function. It is useful for detecting mistakes in the arguments being passed to a function.
  • isort: I'm a tidy man. I like when my imports are ordered alphabetically. That's what isort is there for.
  • black: I don't particularly like discussing code style with others because everyone has their own quirks, and creating a code style that everyone agrees on is as difficult as agreeing on whether tabs or spaces should be used. black is highly opinionated and doesn't allow for much to be tweaked, and it has a sensible style, even if it can sometimes drive you crazy.
  • prospector: Prospector allows you to run a variety of linters on your code, which is quite useful when you like your code to be as standard and pretty as I like it. Some of the tools also look for code complexity, which helps you identify nightmares before they're in the master branch.
  • poetry: I've used pip, I've used pipenv, requirements.txt, setup.py, etc. I didn't like setup.py because, since using composer (for PHP), I've always seen dependency management as something that shouldn't require code to define. I didn't like the requirements.txt/.lock variants because it was never clear how those were generated or whether they were kept up to date together, since you could use requirements.txt as soft dependencies and requirements.lock as hard dependencies that you had to freeze yourself (which many people didn't know about). pipenv's Pipfile was alright, but adding dependencies seemed to take longer and longer, which wasn't a pleasant experience, especially when the package you wanted to add didn't want to play nice with the other packages. poetry was the closest experience I got to composer.
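To illustrate the mypy entry above, here is a small sketch of the kind of argument mistake a type checker is meant to catch. The function `mean` is a made-up example, not taken from any of the projects mentioned.

```python
from typing import List


def mean(values: List[float]) -> float:
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# This call matches the signature, so a type checker accepts it.
average = mean([1.0, 2.0, 3.0])

# A call such as mean("123") passes a str where List[float] is expected.
# mypy flags it as an incompatible argument type before the code runs;
# without the check, the mistake only surfaces as a TypeError at runtime.
```

The annotations cost nothing at runtime, since Python itself does not enforce them; the verification only happens when mypy is run over the code.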