14 Feb 2020

Vocabulary gradient

I write a lot of articles and I want them to be understood by most people. How do I use the most common language possible?

My approach is to write whatever I want to write about using whatever language I come up with first. Then I run the text through a tool I've developed, which I call the vocabulary gradient. It is a very simple tool: you paste in the article you've written and look at the result of the analysis. The tool uses a word frequency list as specified in the README.md. This list was built using the Project Gutenberg library, which makes the word frequency list a bit outdated.

The report generated by the tool presents the minimum, average, maximum and standard deviation of the index of the words used in the text you provided. Those numbers give you a rough overview of the difficulty of your text based on word frequency alone. The lower your average and maximum are, the simpler the article should be to understand. A histogram is also generated, where the bins are again based on the index of the word in the frequency list. Finally, the provided text is rendered with each word's index as a subscript. Unknown words are highlighted in yellow, while known words are shaded in a darker and darker gray as their index increases.
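
The summary statistics described above can be sketched in a few lines of Python. This is a minimal illustration and not the tool's actual code: the function names, the toy frequency list, and the choice of population standard deviation are all assumptions made for the example.

```python
import re
import statistics

def word_indexes(text, frequency_list):
    """Split the text into words and look up each word's rank in the list."""
    # Rank 1 is the most common word; lookups are case-insensitive.
    rank = {word: i for i, word in enumerate(frequency_list, start=1)}
    words = re.findall(r"[a-z']+", text.lower())
    known = [rank[w] for w in words if w in rank]
    unknown = [w for w in words if w not in rank]
    return known, unknown

def report(text, frequency_list):
    """Summarize the ranks of the known words (assumes at least one is known)."""
    indexes, unknown = word_indexes(text, frequency_list)
    return {
        "min": min(indexes),
        "mean": statistics.mean(indexes),
        "max": max(indexes),
        "stdev": statistics.pstdev(indexes),  # population standard deviation
        "unknown": unknown,
    }

# Toy list, most common word first (made up; the real list comes from Project Gutenberg).
FREQUENCY_LIST = ["the", "of", "and", "to", "a", "cat", "sat", "mat", "on"]
summary = report("The cat sat on the ottoman", FREQUENCY_LIST)
print(summary)
```

In this toy run, "ottoman" is not in the list, so it ends up under `unknown`, which corresponds to the words the tool highlights in yellow.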

With this information in hand, you can spot the words that have a high index in the frequency list and try to replace them with lower-index words.

How can a project be well executed through consensus instead of leadership?

To properly execute a project without a leader who makes the important decisions, time is one of the most critical components. Without enough time, decisions are not reached by consensus but are made by whichever individuals are available when the decision is required. It is also important not to rush things, as rushing simply leads to bigger and bigger mistakes happening more and more quickly. At some point a few or many team members will realize they've moved too far too fast and that many of the necessary pieces are missing, which makes the work they've done so far either irrelevant or of low value.

The team members also need to know each other well enough to know their strengths and weaknesses. Without this knowledge, weaknesses are often ignored because they are the weaknesses of everyone.

Decisions need to be reviewed regularly to ensure consistency. Since there is no single leader who keeps the project in their head and wants to see it to completion, it is necessary to ensure that the work that is planned and executed is consistent with a common vision. Not doing so can lead the team to work on features that go in opposite directions, or on features that are not aligned with the users the project targets. Good moments to review those decisions are the planning sessions and the review/demo sessions.

Team members should often work in pairs, and change partners regularly, so that their understanding of the project gets discussed with different individuals who hold different positions. This avoids always working with the same person who shares your opinion of the work to be done, while other individuals may completely disagree with that opinion. Surfacing such differences of opinion early in the development process is critical, since differences that stay hidden can lead to work that is not aligned with the project.

Overall, we want to reduce as much as possible the time between the moment an individual or group of individuals forms an incorrect understanding of the project's goals and the moment that understanding is corrected.

13 Feb 2020

Data anonymizer

I want my clients to share confidential data with me without revealing what the exact values are so that I can train machine learning models on this data.

I wrote a simple Python package that uses pandas and scikit-learn to apply some simple transforms to the data. Some of the transforms change the distribution of the data, and therefore its statistical properties, while others preserve the distribution and simply rescale the domain.
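
To make the difference between the two kinds of transforms concrete, here is a minimal sketch in plain Python. It is a stand-in written for this example, not the package's code: the function names and the salary figures are made up. The first function is roughly what scikit-learn's `MinMaxScaler` does to a single column, and the second is a simple rank transform.

```python
def min_max_rescale(values):
    """Rescale to [0, 1]. Relative spacing is kept, so the shape of the
    distribution is preserved; only the domain changes."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def rank_transform(values):
    """Replace each value by its normalized rank. Ordering is kept, but the
    spacing between values is lost, so the distribution changes."""
    order = sorted(set(values))
    position = {v: i for i, v in enumerate(order)}
    denom = max(len(order) - 1, 1)
    return [position[v] / denom for v in values]

# Made-up confidential column with one large outlier.
salaries = [40000, 42000, 45000, 250000]

rescaled = min_max_rescale(salaries)  # outlier still stands far apart
ranked = rank_transform(salaries)     # values become evenly spaced

print(rescaled)
print(ranked)
```

After rescaling, the outlier is still clearly visible because the first three values stay bunched near zero. After the rank transform, the four values are evenly spaced and the outlier can no longer be told apart from an ordinary maximum.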

Given a dataset anonymized with this tool, it is possible to do a preliminary data audit and possibly train machine learning models on the data, giving clients a quick idea of whether their data looks promising without actually revealing the true numbers (unless they want to).

The main concern with this approach is that most clients are not technical, so having them anonymize their data themselves is generally difficult, if not impossible. As a result, such a tool is currently not applicable in the desired context.

12 Feb 2020

Visual Studio Code Run Me extension

I frequently run the same commands with different parameters but I have a terrible memory. I also use Visual Studio Code a lot.

I developed an extension in 2018 called Run Me. Its goal is to let you define commands that you can customize through a form. The form is a series of questions you are asked before the command is launched with the parameters you provided.
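
The general idea of asking a few questions and then filling in a command can be sketched in Python. This is only an illustration of the concept, not the extension's implementation or its configuration format: the `$name` placeholder syntax, the function names, and the `new-article` command are all hypothetical.

```python
from string import Template

def ask(questions, prompt=input):
    """Ask each question in order and collect the answers by parameter name."""
    return {name: prompt(f"{question}: ") for name, question in questions.items()}

def build_command(template, answers):
    """Substitute the answers into the $name placeholders of the template."""
    # No shell quoting is applied here; answers are inserted as-is.
    return Template(template).substitute(answers)

# Hypothetical form: one question per placeholder in the command template.
questions = {"title": "Article title", "category": "Category"}
template = "new-article --title $title --category $category"

# Canned answers stand in for interactive input so the example runs unattended.
canned = {"Article title: ": "vocabulary-gradient", "Category: ": "tools"}
answers = ask(questions, prompt=lambda text: canned[text])

command = build_command(template, answers)
print(command)
```

Passing the prompt function as a parameter keeps the question-asking step separate from the substitution step, so the same template can be filled in from a real form or from stored answers.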

I've used it to do all kinds of things, from launching OBS to resetting the Windows 7 visuals when Windows lowers them due to low memory. I also use it to automate various tasks, such as creating new articles from a template and opening the buffer document I use daily to write notes.

Here's an example of my configuration file which I use to start OBS and to reset the Windows 7 visuals.

"run-me": {
    "commands": [
        {
            "identifier": "start_obs",
            "description": "Start OBS x64",
            "command": "\"C:\\Program Files (x86)\\obs-studio\\bin\\64bit\\obs64.exe\"",
            "working_directory": "C:\\Program Files (x86)\\obs-studio\\bin\\64bit"
        },
        {
            "identifier": "reset_visuals",
            "description": "Reset W7 visuals",
            "command": "sc stop uxsms & sc start uxsms"
        }
    ]
}

12 Feb 2020

Measuring success

How will you measure your success over the next year?

Over the past few years success for me has been defined less by goals and more by being able to work continuously on a process. For example, I might want to get better at writing. My goal is not something like "write one technical book by the end of 2020" but rather "write technical content daily". This decreases the pressure and the need to perform while allowing me to do what I want.

I feel happier and more successful when I can keep doing the same thing over and over, even though I may have no motivation. It shows me that it's possible to accomplish pretty much anything, as long as you're willing to put in the effort.

Because I evaluate success this way, it is easy for me to track whether I'm successful. I use the Loop Habit Tracker (an Android app) to track whether I've worked on something I told myself I wanted to improve. My list started small and contained very mundane things such as "Bed is done", "Empty table" and "No dirty dishes", then grew to include more and more habits, such as "Read 1 wikipedia article", "Answer 1 problem" and "Answer 1 question".

Being able to keep a good habit is what defines success to me. With the help of the app I'm able to add more and more habits without forgetting the previous ones since I'm reminded to do them at the desired weekly frequency.