16 Feb 2020

How to lead a large AGI company

How would you lead an AGI company with 100,000 employees?

I would split the employees into multiple smaller companies, since large companies are unwieldy. I would also encourage different companies to work on the same problem using different approaches. I see the need for a variety of positions:

  • (30%) Tooling and core technologies: Build the tools that other employees use to make progress (visualization, compilation, hardware, database, network). (Bachelor/Master/PhD)
  • (25%) Applied research: Apply the results of fundamental research in a variety of products. (Master/PhD)
  • (15%) Fundamental research: Work on scientific theories in order to improve our understanding of intelligence, learning, doing science, solving problems, programming, etc. (Master/PhD)
  • (15%) IT: Deal with infrastructure management and scaling. (Bachelor/Master/PhD)
  • (5%) Management: Ensure that work is going in a specific direction and is not a random walk. (Bachelor/Master/PhD)
  • (5%) Data collection: Acquire the data necessary for the experiments done by fundamental and applied researchers. (Bachelor)
  • (5%) Administrative/HR/Facility management: Deal with business-related tasks such as people management, facility management/maintenance, etc. (Bachelor)

15 Feb 2020

Healthy software company

What are the signs of a healthy software company?

The following signs assume a software company that does web development.

  • Projects are completed on time and under budget.
  • Development is supported by continuous integration practices.
  • Tests are written for the code developed.
  • The stages of development that come before coding are not rushed just so that coding can start as soon as possible.
  • Projects are put in production and monitored.
  • Events requiring intervention in production are handled without a large amount of stress.
  • Employees have time to share their knowledge of the codebase with one another.
  • Code is reviewed before being merged into the master branch.
  • Version control is used.
  • Processes are documented, followed, and updated when necessary.
  • Traceability is possible from clients' requests to their deployment in a live environment.
  • Most of the system has been designed beforehand and only minor sections of the design need to be updated during the sprint iterations.
  • Priorities have been established and are well documented.

15 Feb 2020

Photo geolocation

I have taken a lot of pictures and I'd like to see where they were taken on a map. I also don't want to have to upload those pictures to a server or download some software.

I've developed a simple tool called Photo geolocation, which is a small client-side application. It uses Leaflet to display the images on an OpenStreetMap map. The EXIF metadata is extracted from each image using exif.js. From the EXIF metadata we extract the GPS coordinates, which are then used to place a pin on the map at the appropriate location. You can click on a pin to see which picture was taken at that location.

To use the tool, simply go to the website and drag and drop your images onto the page. The images and their EXIF metadata are read directly by your browser, so nothing is sent to a server. Images with GPS coordinates get a pin on the map at the location where they were taken, while those without GPS coordinates are simply logged to the browser console and not displayed.
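
EXIF stores a GPS position as degrees, minutes and seconds plus a hemisphere letter (N/S/E/W), while a map pin needs signed decimal degrees. The tool itself runs in the browser with exif.js and Leaflet; the following is only a minimal Python sketch of that conversion step. The function names and the shape of the `exif` dictionary are assumptions made for illustration, not the tool's actual code.

```python
def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) and a hemisphere letter to signed decimal degrees."""
    degrees, minutes, seconds = (float(part) for part in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative in decimal notation.
    return -decimal if ref in ("S", "W") else decimal


def pin_position(exif):
    """Return (latitude, longitude) for a metadata dict, or None if GPS tags are missing.

    `exif` is assumed to be a plain dict keyed by the standard EXIF GPS tag names.
    """
    required = ("GPSLatitude", "GPSLatitudeRef", "GPSLongitude", "GPSLongitudeRef")
    if any(key not in exif for key in required):
        return None  # the tool would log this image instead of placing a pin
    return (
        dms_to_decimal(exif["GPSLatitude"], exif["GPSLatitudeRef"]),
        dms_to_decimal(exif["GPSLongitude"], exif["GPSLongitudeRef"]),
    )
```

For example, `dms_to_decimal((45, 30, 0), "N")` gives `45.5`, and an image with no GPS tags yields `None`, which matches the "logged but not displayed" case above.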

14 Feb 2020

Vocabulary gradient

I write a lot of articles and I want them to be understood by most people. How do I use the most common language possible?

My approach is to write whatever I want to write about using whatever language I come up with first. Then I use a tool I've developed, which I've called the vocabulary gradient. It is a very simple tool: you copy and paste the article you've written and look at the result of the analysis. The tool uses a word frequency list as specified in the README.md. This list was built using the Project Gutenberg library, which makes the word frequency list a bit outdated.

The report generated by the tool presents the minimum, average, maximum and standard deviation of the index of the words used in the text you provided. Those numbers give you a rough overview of the difficulty of your text based on word frequency alone. The lower your average and maximum are, the simpler the article should be to understand. A histogram is also generated, where the bins are again based on the index of the word in the frequency list. Finally, the provided text is rendered with each word's index as a subscript. Words that are unknown are highlighted in yellow, while words with a high index are shaded in gray, with the shade getting darker as the index increases.

With this information in hand, you can spot the words that have a high index in the frequency list and try to replace them with lower-index words.
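
The summary statistics described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the tool's actual code: the function name, the tokenization rule, the 1-based index, and the use of the population standard deviation are all choices made here, and the tiny frequency list in the usage note is made up.

```python
import re
import statistics


def vocabulary_report(text, frequency_list):
    """Summarize how common the words of `text` are.

    `frequency_list` is assumed to be ordered from most to least frequent,
    so a word's index (starting at 1) grows as the word gets rarer.
    """
    index_of = {word: i for i, word in enumerate(frequency_list, start=1)}
    words = re.findall(r"[a-z']+", text.lower())  # simplistic tokenizer (assumption)
    known = [index_of[w] for w in words if w in index_of]
    unknown = [w for w in words if w not in index_of]
    if not known:
        return {"min": None, "mean": None, "max": None, "stdev": None, "unknown": unknown}
    return {
        "min": min(known),
        "mean": statistics.mean(known),
        "max": max(known),
        "stdev": statistics.pstdev(known),  # population standard deviation (assumption)
        "unknown": unknown,
    }
```

With the made-up list `["the", "of", "and", "cat", "sat"]` and the text `"The cat sat and purred"`, the known indexes are 1, 4, 5 and 3, so the report has a minimum of 1, a mean of 3.25 and a maximum of 5, and `"purred"` is reported as unknown.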

How can you keep a website up to date and yet have previous visitors recognize new content as fast as possible?

As I am a developer, the most straightforward answer to this problem is to use a tool such as diff. When I write articles on my blog, I use Visual Studio Code, which I have configured to save when the window loses focus (or when the current editor tab changes). On each save, I also automatically create a git commit with a very boring message: "Automated save from VS Code.". The point is not to have a fancy commit message, but to have a trace of when the changes were made. This allows me to offer my visitors the ability to view the history of changes made to an article.

The downside to this approach is that it is not very easy to diff an article between two versions using the GitHub web UI. It requires manually editing the URL to provide the base and latest SHA-1 hashes, then finding the article in the list of changed files. This makes the experience rather painful, so it is unlikely that anybody will do it.

Given that the git repository is available on the server where the blog is hosted, it would be possible for me to run a git diff command given the last version seen by the visitor. This would allow me to present the changes that were made since the visitor's last visit. For instance, removed sentences would simply not be displayed, since they are likely to be irrelevant to the visitor, while new sentences would be highlighted in green.
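
The display rule described above (hide removed sentences, flag new ones) can be sketched as follows. To keep the example self-contained it compares two versions of the text with Python's `difflib` instead of calling `git diff`, so it is a stand-in for the idea and not the proposed implementation. The function names and the naive sentence splitter are assumptions.

```python
import difflib
import re


def split_sentences(text):
    """Naively split text into sentences at '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def mark_new_sentences(last_seen, current):
    """Return the current sentences as (sentence, is_new) pairs.

    Sentences that only exist in `last_seen` (removals) are left out entirely.
    """
    old, new = split_sentences(last_seen), split_sentences(current)
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    marked = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "equal" spans are unchanged; "insert" and "replace" spans are new to the visitor.
        # "delete" spans cover no sentences of `new`, so they add nothing.
        marked.extend((sentence, tag != "equal") for sentence in new[j1:j2])
    return marked
```

A template could then wrap every sentence whose flag is `True` in a green highlight. For the versions `"A one. B two. C three."` and `"A one. C three. D four."`, only `"D four."` is flagged as new and `"B two."` does not appear at all.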