Dec. 2013

"Because it's easy and convenient." - a user

Last Modified: Thu Dec 26 11:50:21 UTC 2013

Temporary Internet Suicide till Jan. 3rd 2013-12-26 [Thu] 20:23

Let's see what will happen.

ATrulyInternationalizingCode 2013-12-25 [Wed] 17:28

When you're truly i18nized, you call Japanese and English "CJK" and "Western" respectively.

Wut. 2013-12-25 [Wed] 17:25

!?

(yah, nmhg.)

eeeee-Japan 2013-12-23 [Mon] 23:47

I've always thought that the Japanese government is the worst at hiring private contractors and building websites, but maybe the situation isn't that different anywhere else. It's not that surprising, though, considering how a government works (liability, lack of technical understanding, the sheer amount of bureaucracy and sectionalism, etc.)

But then, there's another (more fundamental and serious) problem:
"There's no easy way for average people (or even experts) to accurately measure software quality."

This. 2013-12-22 [Sun] 19:09

A common mistake that people make when trying to design something completely foolproof is to underestimate the ingenuity of complete fools.
-- Douglas Adams

Power of Kernel Oplocks 2013-12-22 [Sun] 11:46

Another old thing that I've long wanted to write down...

I have two machines, Windows and Linux. I use the Linux box as a file server among other things, and sometimes I modify the same file from both Windows and Linux. But I noticed that such files don't stay well synchronized: seen from the Windows side, they often turn out to be stale, corrupt, or simply missing.

Initially I thought this was a clock synchronization problem, so I double-checked the Samba config files and ntp settings, ultimately in vain.

As it turned out, you simply shouldn't do this (modify the same file from both sides).

The reason is that Samba, by default, doesn't detect that a local file has been changed. Furthermore, a Windows program acquires a "lock" (an opportunistic lock, or oplock) on the file, which lets it cache the contents locally; the Samba server is then supposed to guarantee that the file isn't modified from outside.

One way to address this is to access everything via Samba/CIFS, even from a Unix process.

But that feels like a bit too much. Then I found a Samba feature called "kernel oplocks". With it enabled, Samba notices local file changes and takes the appropriate action toward its Windows clients (breaking their oplock so they re-read the file).
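
For the record, the relevant knob lives in smb.conf's [global] section (a minimal sketch; "kernel oplocks" is a real Samba option, but check the docs for your Samba version before copying this):

    [global]
        # Let Samba take kernel leases on files, so a change made by a
        # local Unix process breaks the Windows client's oplock and
        # forces the client to re-read the file.
        kernel oplocks = yes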

Conclusion: Yay. (But this option makes file operations somewhat slower. Use with care.)

Interruption, Hierarchy and Management 2013-12-21 [Sat] 19:58

Suppose that you work as part of a big organization. You always get your orders from a supervisor. One day, you spot a certain problem and report it to your boss. But what if the problem is very, very serious and worth the attention of higher-ups? Even without all the typical "human" issues (losing face, etc.), this is already a tricky problem, because the flow of control is reversed here.

This is similar to a software/hardware interrupt. In a normal situation, control is passed from the top to the bottom; when an interrupt happens, it goes from the bottom to the top. In software, interrupts are notoriously tricky to handle correctly, probably for this very reason. Now, if it's difficult in software, isn't it also difficult to handle in real life? This is something I'll have to think about when there's time.
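
In code, the common discipline looks a lot like what a sane organization would want: the handler (the bottom-up path) does as little as possible and merely records the event, and the normal top-down flow picks it up at a safe point. A minimal sketch in Python, with SIGINT standing in for "a problem reported from below":

    import signal
    import time

    interrupted = False

    def handler(signum, frame):
        # Bottom-up path: just record that something happened.
        # Doing real work here is where the trickiness begins.
        global interrupted
        interrupted = True

    signal.signal(signal.SIGINT, handler)

    while not interrupted:
        time.sleep(0.1)  # the normal top-down work goes here

    print("interrupt handled at a safe point")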

Throughput versus Latency 2013-12-21 [Sat] 14:19

Lately there have been some gripes about Twitch adding extra delay to their livestream feeds. Unfortunately, I think this change is more or less permanent, and there's no easy way to go back. But it's not all Twitch's fault. Let me explain how this can happen and why there's no easy way out. (Disclaimer: the following explanation is entirely my own speculation; I have no knowledge of the inner workings of Twitch's systems.)

First of all, let's do a little math: how many viewers can one Twitch livestream server handle? According to Twitch's own recommendations, each livestream takes, say, 1Mbps to 3Mbps. And how much bandwidth does each server machine get? It depends on the datacenter, but it's usually less than 10Gbps per machine. Factoring in things like bandwidth fluctuation (a server can't always count on its maximum bandwidth), we could say each server can serve up to about 1,000 viewers. Now, what if there are more viewers than one server's capacity?
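
Here's that back-of-envelope arithmetic spelled out (every number below is my own guess, not Twitch's):

    # Rough capacity of a single livestream server (all numbers guessed).
    stream_mbps = 3.0    # bitrate of one outgoing stream (Mbps)
    server_gbps = 10.0   # NIC bandwidth of one server (Gbps)
    headroom = 0.3       # usable fraction after fluctuation etc.

    viewers_per_server = server_gbps * 1000 * headroom / stream_mbps
    print(int(viewers_per_server))  # -> 1000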

Here's where things get interesting. An obvious way to cope is to add "layers" to the server cluster: a video stream from a broadcaster is first received by one Twitch server, which then rebroadcasts it internally to other in-house Twitch servers; each of those servers in turn broadcasts the stream to viewers.

This way, one stream can be fanned out to as many as a million viewers (1,000 x 1,000). But what if there are even more users? And we're still talking about a single broadcaster. With the advent of the PS4 and Xbox One, there could potentially be a million broadcasters, each with a different set of viewers! That's the scale Twitch is trying to manage right now. The natural answer is to add more layers, multiplying the number of simultaneous viewers further.
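
Each layer multiplies the reach by the per-server fan-out, so the audience grows geometrically with the depth of the tree (same guessed fan-out of 1,000 as above):

    fanout = 1000  # viewers (or downstream servers) one server can feed
    for layers in range(1, 4):
        print(layers, fanout ** layers)
    # 1 1000
    # 2 1000000
    # 3 1000000000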

In such a relay chain there's always a slight delay between servers. It could be network delay, the aggregation of multiple input streams (hence "buffering"), or simply overhead in the server itself. Either way, the more layers you add, the more latency you get, and the sum is the lag a viewer experiences. Roughly speaking, the lag grows with the depth of the relay tree, i.e. with the logarithm of the audience size.
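
Conversely, the depth the tree needs is the logarithm (base fan-out) of the audience size, and if each hop adds a roughly constant buffering delay, the viewer-visible lag is just depth times per-hop delay. A sketch of that model, with the per-hop delay a pure guess:

    fanout = 1000        # per-server fan-out, as guessed above
    per_hop_delay = 2.0  # seconds of buffering added per relay hop (guess)

    def lag(viewers):
        # Count how many relay layers the tree needs for this audience.
        layers, reach = 1, fanout
        while reach < viewers:
            layers += 1
            reach *= fanout
        return layers * per_hop_delay

    for v in (1000, 1000000, 100000000):
        print(v, lag(v))  # -> 2.0s, 4.0s, 6.0s of lag respectively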

One way to mitigate this is to minimize the delay of each hop, but I think that's already difficult with this many servers. Google, which runs a far larger system than Twitch, seemingly has even worse delay with its Hangouts livecasts. So it's a tricky problem after all.

And remember, on top of all this there's still a lot of uncertainty. They don't know exactly how large the audience will eventually be, how viewers move across multiple channels, and so on. There's always a risk of network disruption, too. All things considered, they want to keep the system very flexible, so that they can adjust it quickly to meet sudden needs. This is a massive engineering task, and honestly I want to praise them for having the courage to take it on. Still, I wish there were less delay in each stream. Since they've already decided to take on PS4 users, this change was somewhat inevitable, and now we're all seeing its consequences.

All of this is a manifestation of a well-known problem in computer networking, called "throughput versus latency". There's always a trade-off between the two: the more you want to deliver at one time, the more delay you get, because you need some time/space to aggregate things (so-called "buffers"). This is just an inevitable law of life. Even outside computer networks, the trade-off shows up in everyday transport:

(images omitted: a cargo ship vs. a smaller, faster means of transport)

Cargo ships can obviously carry a huge amount of stuff at once, but they can't depart every hour; they need time to load and unload. That's buffering in real life.

Sudden Outrage 2013-12-21 [Sat] 10:52

Suddenly I realized that I hadn't updated this page for about a month!

And I kept thinking "oh, I have to update my diary, I have to update my diary" for weeks! This is truly outrageous.

From my TODOs:

** Mon Dec  9 23:51:17 JST 2013 **
diary

Yusuke Shinyama
Document ID: a0a343bd52aaeda18894d7794f2697a6