
2019 Tattered

In 2018 I moved to Tokyo and started working as a web engineer maintaining services in various cloud environments. Since this August, I have been on a newer team dedicated to building a piece of software called 'KUSANAGI', a CentOS environment tuned for ultra-fast web serving, shipped with lots of presets and convenient tools.

A lot of new approaches were required, and I had to revise many things I learned years ago in order to make better decisions. Still, I can summarize this year as an exciting journey full of adventures. Below I share what I've earned during this journey, what I've achieved, and the opportunities I didn't take but could have.

What I’ve Earned

Deeper understanding of OS (especially Linux) and Server maintenance

Since May 2018 my main job was to maintain running applications in various clouds, and Linux was the only way in to reach them and track down faults. I was quite confident in my searching skills and my deduction in finding clues, but grep and find alone never took me into deep water. I picked up other helpful tools such as xargs and sed for batch jobs, wrote simple shell scripts for scheduled work, and used crontab to register cron jobs as needed.
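
A typical batch job looked roughly like this (the paths and the pattern are made up for illustration):

    # rewrite a setting across every vhost config in one pass
    find /etc/nginx/conf.d -name '*.conf' -print0 \
      | xargs -0 sed -i 's/keepalive_timeout 65;/keepalive_timeout 30;/'

    # crontab entry: run a nightly cleanup script at 03:00 and keep its log
    0 3 * * * /usr/local/bin/nightly-cleanup.sh >> /var/log/nightly-cleanup.log 2>&1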

tmux worked like a charm. With a friend's help, I configured tmux so I can copy and paste from any pane with mouse clicks, and multitasking in tmux became fundamental to my workflow. I am still looking for further improvements to my WSL Ubuntu environment. (I cannot wait for WSL2 to be released!)
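
The relevant part of my ~/.tmux.conf is roughly this (on WSL I pipe the selection to clip.exe; xclip would play the same role on a normal X11 desktop):

    # enable mouse support so panes can be selected and text dragged
    set -g mouse on

    # copy a mouse selection straight to the Windows clipboard (WSL)
    bind -T copy-mode-vi MouseDragEnd1Pane send-keys -X copy-pipe-and-cancel "clip.exe"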

Web Server Fundamentals and how to CRUD configurations from a blank sheet

In 2018, nginx and httpd were just tools to turn on and off, and configurations were things to paste and patch new directives into. Of course, that was not enough. I had to figure out how a web server (especially on Linux) works as a daemon process, and how to integrate it with different CGI backends in order to benchmark website performance.

Each configuration syntax was unique, but by maintaining lots of different applications I came to understand how each directive is interpreted by the web server and how to arrange them better to avoid frequent misuses. I was able to find equivalent functionality across httpd and nginx modules, and to debug errors from the logs easily.
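
For example, writing a PHP-backed server block in nginx from a blank sheet ends up looking something like this (the paths and socket are illustrative):

    server {
        listen 80;
        server_name example.com;
        root /var/www/html;
        index index.php index.html;

        # hand PHP requests to the FastCGI backend (php-fpm)
        location ~ \.php$ {
            include fastcgi_params;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            fastcgi_pass unix:/run/php-fpm/www.sock;
        }
    }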

Web Security Fundamentals and basic server-side approaches to prevent leaks

KUSANAGI mostly uses WordPress as its platform, and since WordPress is the largest open-source CMS, maintaining security is the main concern for our users. One of my daily tasks was to check vulnerability reports and find out whether any reported issue affected our environment.

There were a few incidents to deal with, but mostly I learned how services become vulnerable and how attackers exploit those weaknesses. A lot of WordPress plugins were reported for unescaped queries, insecure GET/POST handling, and so on. This experience helped me a lot when I started building applications and reviewing code in the team, as I had seen the real issues that cause these problems.
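
Most of those plugin flaws boil down to the pattern below; the fix is to run request input through WordPress's own sanitization and prepared statements (a simplified sketch with a made-up query):

    <?php
    global $wpdb;

    // unsafe: interpolating $_GET straight into SQL is what many reported plugins did
    // $rows = $wpdb->get_results( "SELECT ID FROM {$wpdb->posts} WHERE post_title = '{$_GET['title']}'" );

    // safer: sanitize the input and let $wpdb->prepare() handle the escaping
    $title = sanitize_text_field( wp_unslash( $_GET['title'] ?? '' ) );
    $rows  = $wpdb->get_results(
        $wpdb->prepare( "SELECT ID FROM {$wpdb->posts} WHERE post_title = %s", $title )
    );

    // and escape again on output
    echo esc_html( $title );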

Deeper understanding of Cloud Environment (especially AWS) and PaaS maintenance

As I mentioned earlier, hundreds of customers' applications were deployed in the cloud, so I had to keep up quickly. I committed to studying cloud solutions architecture and put it into practice on the job. Daily work involved managing instances, load balancers, network settings such as security groups and ACLs, and a few PaaS offerings such as S3 and Lambda. I mainly used AWS, Microsoft Azure, and GCP, along with some Japanese local cloud hosting services. For maintenance, each provider's monitoring services helped me understand the status of both the cloud resources and the Linux machines, but I also ran Zabbix on each instance to double-check and to automate a few safety triggers to pull when a minor problem appeared.

Deeper understanding of WordPress Core and built-in APIs

I have been using WordPress since 2013, but last year I had to dive into the complexity of enterprise-level WordPress. To be honest, I encountered some really messy installations with nasty must-use plugins that had seen no maintenance for years, and I fixed various plugins with flaws. Some fixes were easy, but some required restructuring the code to close a vulnerability or to replace outdated WordPress APIs.

My frustration was relieved by discoveries in the WordPress core APIs and by reading the true intentions of the core developers. I would give the MVP to the template hierarchy and to the action and filter hooks that let you intervene in almost any WordPress core behavior. They helped me understand how such a complex PHP environment works, and how the APIs provide dynamic control over the whole system.
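
A trivial example of why the hooks got my MVP vote: a few lines are enough to step into core behavior without touching core code (the prefix here is just an illustration):

    <?php
    // filter hook: adjust the document title right before WordPress prints it
    add_filter( 'document_title_parts', function ( $parts ) {
        $parts['title'] = '[demo] ' . $parts['title'];
        return $parts;
    } );

    // action hook: run our own setup once the environment has loaded
    add_action( 'init', function () {
        // register post types, rewrite rules, and so on
    } );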

Deeper understanding of Lua manipulation with httpd.mod_lua and nginx.openresty

The newer KUSANAGI has many interesting features, and one of them is a web server post-processor. To build a universal plugin that works on websites written in any language or environment, we implemented it as a web server plugin (for both Apache and nginx). Since both servers have Lua module support, we built quite a number of fascinating algorithms that post-process any CGI's output into clean, rewritten HTML that loads superbly fast in any modern browser.

I found Lua very interesting, especially powerful functions such as gsub, and the web server modules give full access to request and response data for easy manipulation.
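
On the nginx side, the shape of such a post-processor is roughly the OpenResty snippet below (the real rewriting logic is far more involved; the .gif-to-.webp swap here is only an illustration):

    location / {
        proxy_pass http://127.0.0.1:8080;

        # the body length changes, so drop the upstream Content-Length
        header_filter_by_lua_block {
            ngx.header.content_length = nil
        }

        # rewrite each chunk of the generated HTML before it leaves the server
        body_filter_by_lua_block {
            local chunk = ngx.arg[1]
            if chunk and chunk ~= "" then
                ngx.arg[1] = chunk:gsub("%.gif", ".webp")
            end
        }
    }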

Deeper understanding of Web Fundamentals and performance tuning approaches

Our goal is to deliver a super-fast website to our customers and to provide automated settings that let users modify their site without affecting its speed. To do that, we first had to answer a basic question: what is a delightful web experience? Our guide was the Lighthouse score and its audit results.

From preload to lazy-load, from delivering an accurate first contentful paint to doing so without disrupting the inline scripts firing all over a site, every method we took followed the philosophy of how the browser tries to deliver a page as fast as possible. It was a great experience to watch websites become faster, and fascinating to discover that the same approach works on other websites with no customization.
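
In markup terms, the two ends of that spectrum look something like this (the file names are made up):

    <!-- preload: fetch the critical assets before the parser discovers them -->
    <link rel="preload" as="image" href="/img/hero.webp">
    <link rel="preload" as="font" type="font/woff2" href="/fonts/main.woff2" crossorigin>

    <!-- lazy-load: let below-the-fold images wait until they are near the viewport -->
    <img src="/img/gallery-01.webp" loading="lazy" width="640" height="360" alt="gallery photo">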

Node.js Fundamentals and a basic OOP approach to building a Node application

I knew how to develop JavaScript (including jQuery and Ajax) for front-end web development, but this time was different. Our new project was built on Node.js, and I was dedicated to one specific piece: the web crawler. I designed the whole object with asynchronous functions full of Promise.all and async/await pairs, since the crawler can take a different amount of time on each website it visits (due to network conditions, site size, and so on).
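
The core of that design is small; a simplified sketch (fetchPage stands in for the real page-fetching routine):

    // crawl a batch of URLs concurrently and wait for every result
    async function crawlAll(urls) {
      return Promise.all(
        urls.map(async (url) => {
          try {
            const html = await fetchPage(url); // stand-in; each site takes its own time
            return { url, ok: true, bytes: html.length };
          } catch (err) {
            return { url, ok: false, error: err.message };
          }
        })
      );
    }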

Honestly, I only barely grasped the basics of asynchronous applications and got an idea of how they save time. I wish I could have gone further and made the crawler a parallel procedure with multiple pools, to add more asynchronous spice to this fancy soup.

Puppeteer and Chromium Fundamentals to build a parallel web crawling application

Puppeteer was probably the main reason the whole system was built in Node.js. It was the most fun, and the hardest to debug, thing this year. I have some web-automation experience with old-fashioned Selenium and Firefox's Selenium IDE, but this time the focus was on the access-and-fetch activity itself. The project gave me a broad idea of what a browser really is and how Puppeteer is built to manipulate a headless browser with ease.

It is still a headache when Chromium itself crashes with no report at all (or just hangs with no response). I found that other people avoid the problem by building a parallel cluster of multiple Puppeteer workers that retry until they get a 200, and I may steer the whole project toward parallel Puppeteer. (I would still prefer to use only one solid browser per instance.)
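
The retry idea looks roughly like this with a single browser (a simplified sketch; the real worker does much more per page):

    const puppeteer = require('puppeteer');

    async function visitWithRetry(url, attempts = 3) {
      const browser = await puppeteer.launch({ headless: true });
      try {
        const page = await browser.newPage();
        for (let i = 0; i < attempts; i++) {
          const response = await page.goto(url, { waitUntil: 'networkidle2', timeout: 30000 });
          if (response && response.status() === 200) {
            return await page.content();
          }
        }
        throw new Error(`no 200 from ${url} after ${attempts} attempts`);
      } finally {
        await browser.close(); // always release Chromium, even when it misbehaves
      }
    }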

Image Optimization (Especially GIF)

This old animation format took me down another deep rabbit hole. Still images (JPEG, PNG) were a piece of cake compared to GIF. Since a GIF's weight depends heavily on dimensions, quality, frame count, bit depth and so on, I had to apply various methods simultaneously and see which gave the lighter result without breaking the animation or the resolution.

After crafting the necessary arguments for each command, I ran a classic A/B test between two tools, ImageMagick and gifsicle, and let the result decide which image to store. Every image was also converted to .webp, so modern browsers can fetch it as quickly as possible. (IE simply gets the GIF; the Lua-powered post-processing on the web server handles the switch.)
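
Stripped down to its essence, the A/B test is just running both tools on the same file and keeping the smaller output (the flags and quality values here are illustrative, and --lossy needs a reasonably recent gifsicle):

    # candidate 1: ImageMagick frame optimization
    convert input.gif -coalesce -layers Optimize candidate_im.gif

    # candidate 2: gifsicle with lossy compression
    gifsicle -O3 --lossy=80 input.gif -o candidate_gs.gif

    # keep whichever candidate is smaller, then add a WebP variant for modern browsers
    convert input.gif input.webp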

It seems image processing still mostly relies on ImageMagick, but I found gifsicle very interesting and want to dig into it more.

Deeper understanding of Database (especially SQLite) and Basic approaches of performance tuning

The first version of the crawler used ioredis, a native Redis module, to save all crawled data and process it later. The problem started when the data grew larger and consumed a lot of memory. I chose SQLite and built a relational database instead of MongoDB, since 90% of the queries executed were SELECTs, which I thought SQLite in WAL mode could handle faster than a non-relational database.

Most of my database experience had been about single-query performance, but this time I used SQLite in a broader way than I expected. Distributing queries and controlling execution with TRANSACTION blocks were critical, since Node.js handles multiple jobs asynchronously and the writes needed explicit coordination. Indexing and constraints also accelerated a few complex SELECT queries (with JOINs).
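
The SQLite side of that, reduced to its essentials (the table and column names are made up):

    -- WAL mode lets readers keep running SELECTs while the crawler writes
    PRAGMA journal_mode = WAL;

    -- group the crawler's writes so they hit the disk as one unit
    BEGIN TRANSACTION;
    INSERT INTO pages (url, status, body_size) VALUES ('https://example.com/', 200, 51234);
    -- ... more rows collected by the crawler ...
    COMMIT;

    -- an index for the frequent SELECT ... JOIN lookups on url
    CREATE INDEX IF NOT EXISTS idx_pages_url ON pages (url);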

Windows Server Fundamentals and how to CRUD IIS configurations for PHP CGI

This was just scratching the surface of Windows Server. For an in-house project related to PowerPoint automation, I worked with Windows Server for two weeks. I built some really simple PHP API scripts for communication, and ran IIS with PHP's FastCGI to execute PHP scripts on the Windows server.

What I struggled with most was a PHP CGI error: curl won't work under IIS without manually configuring a certificate bundle for SSL communication. This is really a PHP CGI setting on Windows rather than an IIS issue, but almost every Windows Server running IIS will need the same configuration, so it is worth noting.
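
The fix amounts to a couple of php.ini lines pointing curl and OpenSSL at a CA bundle (the path is just wherever you put a downloaded cacert.pem):

    ; php.ini on the Windows/IIS machine
    curl.cainfo = "C:\php\extras\ssl\cacert.pem"
    openssl.cafile = "C:\php\extras\ssl\cacert.pem"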

PowerShell Fundamentals and how to schedule a PowerShell script task

More scratches on the Windows surface. This was my first time working with PowerShell, and I had to get familiar with its syntax first. With basic garbage collection and simple foreach syntax, I was able to build a small PowerShell script that merges PowerPoint presentation files into one.

I used the standard Task Scheduler to run this script every 10 minutes, but the script uses COM interop objects to manipulate a PowerPoint instance running in the foreground. It won't work properly in the background, because interop can't grab a running PowerPoint from there. So I scheduled the task to run only when the user is logged on (that way the script is loaded into a new foreground PowerShell session), and it manipulates the PowerPoint application with no problem. Keeping the Windows Server session logged on was a much easier job with mstsc.
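
The COM part of the merging script is short; a simplified sketch (the paths are made up, and Slides.InsertFromFile does the actual appending):

    # drive a PowerPoint instance over COM and append slides from each file
    $pp     = New-Object -ComObject PowerPoint.Application
    $merged = $pp.Presentations.Open("C:\work\template.pptx")

    Get-ChildItem "C:\work\parts\*.pptx" | ForEach-Object {
        # insert every slide of the source deck after the current last slide
        $merged.Slides.InsertFromFile($_.FullName, $merged.Slides.Count) | Out-Null
    }

    $merged.SaveAs("C:\work\merged.pptx")
    $merged.Close()
    [System.Runtime.InteropServices.Marshal]::ReleaseComObject($pp) | Out-Null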

PowerPoint automation with basic COM Interop manipulation

PHP has a great library called PhpOffice, available through Composer, which lets PHP CRUD any Office file in the Microsoft Office Open XML format through its APIs. Since I had to reproduce tons of presentation files with data from different customers, I designed the script to pull all the data from JSON and build a whole presentation automatically in PHP.
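
The generation side looks roughly like this with PhpPresentation (the file names and the JSON shape are made up for the sketch):

    <?php
    require 'vendor/autoload.php';

    use PhpOffice\PhpPresentation\PhpPresentation;
    use PhpOffice\PhpPresentation\IOFactory;

    // one slide per customer entry in the JSON file
    $customers    = json_decode(file_get_contents('customers.json'), true);
    $presentation = new PhpPresentation();

    foreach ($customers as $i => $customer) {
        $slide = $i === 0 ? $presentation->getActiveSlide() : $presentation->createSlide();
        $shape = $slide->createRichTextShape()->setOffsetX(60)->setOffsetY(60);
        $shape->createTextRun($customer['name'] . ': ' . $customer['summary']);
    }

    IOFactory::createWriter($presentation, 'PowerPoint2007')->save('report.pptx');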

Even though I could create a presentation from the data, I still needed to automate PowerPoint with PowerShell to merge it with a few templates for the final result (not a proper job for PhpOffice). I found this piece of gold, and with it I finally made it work. After a few weeks of use, the in-house project was declared done.

Deeper understanding of Lean & agile development and clever uses of Git such as git flow

Last but not least, a team development strategy with clever Git usage showed me how much efficiency can improve our performance. When I worked on server maintenance we had thousands of different environments, and Git wasn't even a candidate solution there (I fought with Ansible, Zabbix, and cloud CLIs instead).

But once I started working on a project that deploys new things daily, teamwork became essential. We needed a plan for handling everyday tasks and deployment rules built around a pull request strategy. I found git flow a fascinating way to run a project with multiple parallel tasks, and it allowed me and the team to stay agile in every decision we make. Branching new ideas off the develop stream or off master keeps everything recorded and hard to lose.
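
In day-to-day terms the flow is nothing more than disciplined branching and merging, roughly like this (the feature name is made up):

    # start a feature off the develop stream
    git checkout develop
    git checkout -b feature/post-processor-tweak

    # ...commit work, push, open a pull request back into develop...

    # after review, merge with an explicit merge commit so the history stays readable
    git checkout develop
    git merge --no-ff feature/post-processor-tweak
    git branch -d feature/post-processor-tweak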

What I’ve done

With what I have earned, I did things like:

AWS Certified Solutions Architect Associate

Yes, I passed the exam last July. For a developer the cloud is unavoidable, and more knowledge about it promises more efficiency in both production and costs. I think the experience of reading up on every AWS service, even briefly, helped me understand the philosophy of cloud computing and become part of it.

Created a dedicated CDN manually with lsync and a proxy configuration with nginx

A few customers went boom and their traffic tripled day after day (like a question straight from the Associate exam…). Their happy screams turned into real worry that the server wouldn't hold out for long. I built servers that receive the image data via lsync and exposed them over the web to act as a CDN. With a DNS-routed hostname, nginx on the origin could rewrite image URLs to point at the CDN server.
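
On the CDN host, the nginx side is plain file serving over the directory that lsync keeps in sync (hostnames and paths are illustrative):

    server {
        listen 80;
        server_name cdn.example.com;

        # lsync mirrors the origin's uploaded images into this directory
        root /var/www/cdn;

        location /images/ {
            expires 7d;
            add_header Cache-Control "public";
        }
    }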

Built highly available architectures with Load Balancer (L4 & L7) in AWS, Azure, GCP, and other local cloud services

Same problem, different approaches. It was quite a challenge to deal with both physical load balancers and cloud load-balancing services, but in the end I managed it with no big issues.

Contributed core algorithms that allow the web server to deliver fast-loading websites

During the last three months my main task was to develop algorithms dealing with common automation issues in the newer KUSANAGI. Most of them were struggles with regular expressions, which were one of my weaknesses. I wouldn't say I've mastered them now, but at least I'm confident using regular expressions to cherry-pick information out of HTML, much like cheerio does with a proper parser.
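
For instance, cherry-picking every image URL out of an HTML string comes down to one expression (a quick-and-dirty sketch; cheerio would parse the document properly instead):

    // given an HTML string `html`, collect the src of every <img> tag
    const srcs = [...html.matchAll(/<img[^>]+src=["']([^"']+)["']/gi)].map((m) => m[1]);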

What I’ve wanted

I wish I could have done these things, but unfortunately these are the ones I couldn't manage.

Artificial Intelligence approaches in algorithms’ decision making

Our new software has a lot of decision-making steps that require some criterion. From the beginning we set those criteria by hand, and we wanted to let AI decide which one to use; for example, to cluster similar websites we used DBSCAN, but that covered only a small part of the whole project.

Python refactoring cases and newer approaches like PyTorch

Since most of KUSANAGI was written in shell scripts, we wanted to refactor it into a modern language such as Python or Node.js. The first reason was the pressing demand for portable Docker images, and the second was the lack of API access to the KUSANAGI control panel (a miserable job for shell scripts). PyTorch was one of the libraries I was keen on, but I only admired it from a distance all year. Since I learned Lua this year, maybe I can go straight to Torch itself.

React and other front-end web development

My last job with a JavaScript framework was in 2014, with the famous AngularJS. Since then I have watched the trends move from older to newer frameworks, but I had no opportunity to use them in the field. Next year I may have the chance to build a control panel and use React to get comfortable with it.

Deeper understanding of Cluster and Parallel computing

As I mentioned earlier, I have only a rough theoretical idea of what parallel computing and multithreading are. I wish I could have done more in that area, but there weren't many chances. Next year I might face parallel-processing challenges with Node.js (even though it is single-threaded at its core) and its cluster module, but I still have to study more.

A complete reading of ‘Operating System Concepts’

As a developer I meet different operating systems every day. To revisit what I learned in my information processing courses, I wanted to read the Dinosaur Book again and pick up clues from it. Due to my laziness I couldn't, but it is still the way to go.

AWS Certified Solutions Architect Professional

AWS SA Professional is the next-level challenge, requiring broad knowledge and flexible application of cloud computing theory. But with the SA Associate experience and what I've learned since, I believe this exam will take me to the next level.

AWS Certified Developer Associate

Lately I discovered how team-level efficiency trains a developer much faster. We're on the verge of introducing new DevOps tooling on Azure and AWS, so I see this exam as a good, eye-opening introduction to the cloud development environment.


Closing

It took a few hours to write this summary, but it was worth it. I hope to do better next year and share my thoughts more often, as I'm pretty sure I've already forgotten a lot of the things I did this year.