Gleitschirmfliegen The Game


Because of reasons, I felt that I had to materialize my ideas before they drove me crazy.
Gleitschirmfliegen The Game is one of my biggest projects of the past few years. In total it took me ~270 hours of editing, plus ~400 hours of rendering (including tests).

The development process was a mixture of learning, having fun and self-torture. Quite a few things are worth some notes.

Intro & Credits

Intro and Credits are the easiest and most fun sections to make. Shotcut is my video editing tool of choice, because it's free and it supports webvfx. Since last year I've been experimenting with webvfx for the telemetry overlay of my paragliding videos. There are issues here and there, but overall it's a great tool.

I tried two new video filters, Sketch and Cartoon. Sketch worked well, and I applied it to the video. Cartoon didn't.


The spiral scene is apparently the most important one. At some point I came up with this joke idea to make my paragliding video look like a game. The original idea was just to add some HUD telemetry info (like in racing games) plus some instructions (e.g. "Hold RT for right brake").

This idea was funny, but not exciting enough for me to actually take action. But somehow it developed in my head by itself, and the virtual hoops just "faded in" to the scene.

This hoop thing actually came from my early imagination of the paragliding examination. I used to think that the participant had to fly through some carefully positioned hoops in order to get a pilot license. And of course I wondered how on earth the paragliding association could set them up. Yeah I know, I played too many games.

The following steps are involved in making the spiral scene:
  1. Determine the telemetry data (e.g. speed, altitude)
  2. Determine the position and attitude of the camera
  3. Design the visual effects
  4. Design the sound effects

1. Telemetry

This includes the map and the gauges (speed, altitude etc.) in the video. Since last year I've been developing the telemetry overlay for my paragliding videos. It took me one or two months to finish, and I'm happy with the result.

I modified the overlay for this video, which works well.

In general the whole telemetry pipeline includes:
- Modified gpmf-parser for extracting telemetry from GoPro videos.
- Modified canvas-gauges for the gauges
- Shotcut/melt for combining the video and the telemetry overlay
- Scripts to automate everything

GoPro Data
GoPro Hero 5 records GPS data, as well as accelerometer and gyroscope readings. Only GPS data are used in my overlay, including 3d position, 3d speed and 2d speed. I managed to compute acceleration from these data, which turned out to match the accelerometer readings quite well.
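As a sketch of that computation (illustrative Python, not my actual script): with velocity samples at 18 Hz, acceleration can be estimated by central differences.

```python
# Estimate acceleration from 18 Hz GPS velocity samples using
# central differences. Illustrative only; the real pipeline reads
# these values from gpmf-parser output.

RATE_HZ = 18.0
DT = 1.0 / RATE_HZ

def acceleration(velocities):
    """velocities: list of (vx, vy, vz) tuples sampled at RATE_HZ.
    Returns central-difference acceleration for interior samples."""
    acc = []
    for i in range(1, len(velocities) - 1):
        prev, nxt = velocities[i - 1], velocities[i + 1]
        acc.append(tuple((n - p) / (2 * DT) for n, p in zip(nxt, prev)))
    return acc

# A constant acceleration of 1 m/s^2 along x is recovered exactly.
vs = [(i * DT * 1.0, 0.0, 0.0) for i in range(5)]
print(acceleration(vs)[0])
```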

Cesium and swisstopo data
In my original overlay, there are both a 3d map and a 2d map. This requires terrain data (the altitude of each point), imagery data (for the 3d map) and a terrain map (for the 2d map).

I was quite impressed when I found all of them at swisstopo. The data are of very high quality and free for personal use. There is even a GeoAdmin API that supports Cesium. The Cesium framework provides a comprehensive set of APIs for map interaction, map rendering and mathematics. And it's open source!

2. Camera Position and Attitude

tldr: I annotated and solved the camera attitude manually.

For VFX I need to know both the position and the attitude of the camera. The camera position is available from GPS data; however the data is only 18Hz, and the accuracy is usually 5-10 meters.

The good news is that since I'm in the sky and everything else is far away, these errors are quite small. Some simple denoising and interpolation makes the data usable. This is actually my final solution.
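As a minimal sketch of what "simple denoising" could mean here (a centered moving average over one coordinate; the actual filter I used may have differed):

```python
# A minimal denoising sketch: centered moving average over one GPS
# coordinate. Good enough when absolute errors of a few meters are
# tolerable, as in this aerial footage.

def moving_average(samples, window=5):
    """samples: list of floats (one coordinate); window: odd size."""
    half = window // 2
    out = []
    for i in range(len(samples)):
        lo, hi = max(0, i - half), min(len(samples), i + half + 1)
        out.append(sum(samples[lo:hi]) / (hi - lo))
    return out

noisy = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
print(moving_average(noisy))
```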

The bad news is that the camera attitude is unavailable. I had thought about recovering the attitude from the accelerometer and gyroscope readings, but based on some articles I had read, I didn't really believe that would work.

Blender Motion Tracking & Camera Solving
I don't remember how I came across those Blender tutorial videos about motion tracking and camera solving, but I was amazed by these features. From the videos, motion tracking seemed to work quite reliably. This was actually one of the most important reasons that I started making this video at all.

I attempted to apply this feature to my videos. Motion tracking worked well as expected, but camera solving almost always produced a large error. Occasionally I got a good solution for a few consecutive frames, but the error started increasing dramatically as soon as I tried to extend the frame range.

Camera Calibration
Then I learned about camera distortion and calibration. After reading a few articles about camera calibration with OpenCV, I reckoned "maybe the Blender camera calibration is not good enough, probably OpenCV will do a better job".

OpenCV has a quite easy-to-use API for camera calibration, although I do have complaints about the Python API: those "vector of vector of vector" data types are cryptic.

I tried both the standard camera calibration and the fisheye calibration with checkerboard images. The solve error was sometimes large, sometimes small, but the calibrated matrix never helped in Blender.

GoPro SuperView
I also searched for calibration data for the GoPro Hero 5. There are some databases covering Hero 3, 4 and 5 with different camera settings, especially the Normal or Wide view angles, but never SuperView. Only after some reading did I realize that SuperView is not an optical distortion at all, but totally artificial post-processing.

I didn't find the official SuperView algorithm, but some relevant articles:
- https://emirchouchane.com/remap2-gopro-360/
- https://intofpv.com/t-using-free-command-line-sorcery-to-fake-superview

I tried both; both worked quite well, and the converted video didn't look distorted at all. But unfortunately, neither helped with camera solving.
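As I understand it, both approaches boil down to a nonlinear horizontal remap: the center of the frame is left nearly untouched while the edges are stretched. The easing function below is made up for illustration; it is not GoPro's algorithm, nor the formula from either article.

```python
# Illustrative only: a SuperView-like horizontal mapping that keeps
# the center of the frame almost unstretched and stretches more
# toward the edges. The cubic form and alpha are invented for
# demonstration, NOT GoPro's actual mapping.

def stretch(u, alpha=0.4):
    """u in [-1, 1] (normalized x); returns remapped coordinate.
    Derivative at u=0 is 1/(1+alpha) < 1, so the center is
    compressed relative to the edges."""
    return u * (1.0 + alpha * u * u) / (1.0 + alpha)

print(stretch(0.0), stretch(0.5), stretch(1.0))
```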

Solving Camera with Perspective-n-Point (PnP)
For usual motion tracking tasks (the ones I saw in those tutorial videos), the objects in the video are buildings or rooms, and it is unlikely that one can obtain the actual 3d coordinates of the keypoints. Otherwise the designer could directly reconstruct the scene, which would make motion tracking useless (or at least not as useful).

On the other hand, in the spiral part of the video, you see nothing but rotating landscape. It's easy to obtain the longitude, latitude and altitude of the features, e.g. corners of a building. Meanwhile, since they are far away, the usual data error is acceptable.

OpenCV provides some functions to solve the PnP problem, but unfortunately none of them worked. At that point I ran out of ideas and decided to just track the camera attitude manually. But before that, I think I found the reason for all these failures:

One day I realized that video stabilization had been turned on for my videos. This means that the center of the lens keeps moving across video frames. No wonder camera solving might work for a few consecutive frames but not for more!

Maybe I can recover the stabilization data from accelerometer and gyroscope readings, but I don't think it would work well.

Manual Tracking
Inspired by the perspective-n-point problem, I realized that the camera attitude can be determined if I know both the 2d (viewport) and 3d (real world) coordinates of at least two points, since I already knew the camera position from GPS.

Because of the PnP attempt, I had already tracked several points in the video and found their 3d coordinates. However I couldn't derive a good formula for solving the camera attitude.

My final solution was to track the 3d position of the "center" (0.5, 0.5) and "center-up" (0.5, 0.75) point of the viewport. With this info, I can compute the camera direction from the camera position and the viewport center, then "center-up" can be used to determine the up direction.

Since both points are near the center, I assume that they are not affected too much by lens distortion or GoPro SuperView.
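The computation can be sketched as follows (plain Python vector math; my real implementation used Cesium's utilities, and the function names here are made up):

```python
# Sketch: given the camera position and the 3d points seen at
# viewport (0.5, 0.5) and (0.5, 0.75), derive the view direction
# and the up vector. Function names are invented for illustration.
import math

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def norm(a):
    l = math.sqrt(dot(a, a))
    return tuple(x / l for x in a)

def camera_attitude(cam, center3d, center_up3d):
    # forward: from the camera toward the point at the viewport center
    forward = norm(sub(center3d, cam))
    # up: component of (center_up3d - cam) orthogonal to forward
    v = sub(center_up3d, cam)
    up = norm(tuple(x - dot(v, forward) * f for x, f in zip(v, forward)))
    return forward, up

cam = (0.0, 0.0, 0.0)
fwd, up = camera_attitude(cam, (10.0, 0.0, 0.0), (10.0, 0.0, 3.0))
print(fwd, up)
```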

I made a manual tracking tool with Cesium, which was surprisingly easy. Since the camera moves along a continuous track, I decided to track every 10 frames or so. In the end I manually tracked ~440 points in total, which is not that much.
Manual Tracking Tool
I used spline functions to interpolate the data, which looks good.
Interpolated Data (red: center, yellow: center-up)
The next step is to calculate the camera attitude from these data. Again, I was able to achieve this quickly with Cesium. By comparing the computed scene and the actual footage frame by frame, I verified that the computed camera attitude is quite accurate (despite SuperView).

One last thing is to interpolate the camera positions. The source data is 18Hz, but I need 60Hz. I only noticed this after importing the data into Blender.
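The resampling itself is simple. A linear-interpolation sketch (I used splines in the end; linear keeps the example short, and `resample` is an invented helper name):

```python
# Upsampling sketch: interpolate 18 Hz GPS samples onto a 60 Hz
# timeline. new_times must be ascending, like the source times.

def resample(times, values, new_times):
    """times/values: source samples (times ascending); new_times: targets."""
    out = []
    j = 0
    for t in new_times:
        while j + 1 < len(times) - 1 and times[j + 1] <= t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        w = (t - t0) / (t1 - t0)
        out.append(values[j] * (1 - w) + values[j + 1] * w)
    return out

src_t = [i / 18.0 for i in range(19)]   # one second at 18 Hz
src_v = [t * 2.0 for t in src_t]        # a linear test signal
dst_t = [i / 60.0 for i in range(61)]   # one second at 60 Hz
print(resample(src_t, src_v, dst_t)[:3])
```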

3. VFX

This is almost my first time using Blender. Well, the real first time was when I downloaded Blender years ago, after hearing that it had been open sourced. The second time was a few months ago, when I needed to paint a UV texture and convert a model format.

This was the first time that I used Blender extensively, although I mostly used it programmatically and never touched the modeling part.

The Blender (I'm using 2.79) UI is honestly really anti-human, but still quite fun to explore. Somehow this reminds me of the first time I saw MS Paint and played with all its tools.

The Python extension is another important feature for me. Since the Blender file format is not trivial, the Python extension is the only way for me to import/export data from/for external tools.

With the camera position data, it was not difficult to create all the hoops in position, although sometimes I had to tweak them a little because some data points were not smooth.
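The geometry behind the hoop placement can be sketched like this (the bpy import calls are omitted, and `hoops` with its parameters is invented for illustration): put a hoop every few camera samples and orient it to face the direction of travel.

```python
# Hoop placement sketch: a hoop every `every` samples of the camera
# path, oriented along the local direction of travel so the pilot
# flies through it head-on. The Blender (bpy) side is omitted.
import math

def norm(v):
    l = math.sqrt(sum(x * x for x in v))
    return tuple(x / l for x in v)

def hoops(path, every=3):
    """path: list of camera positions; returns (position, facing) pairs."""
    out = []
    for i in range(0, len(path) - 1, every):
        facing = norm(tuple(b - a for a, b in zip(path[i], path[i + 1])))
        out.append((path[i], facing))
    return out

path = [(float(i), 0.0, 100.0) for i in range(10)]
print(hoops(path)[0])
```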

There are plenty of tutorial videos out there, and I learned quite a few things:

- Particle
- Fire Effect
- Baking
- Compositing
- Rendering Optimization

Especially I found the compositing nodes really fun. Here are the compositing nodes for my video:

Foreground Background Separation (Masking)
The hoops should be rendered behind the glider, but in front of the terrain. This means that I need to separate the foreground and the background. Like this:

In the beginning I tried to find some algorithms or tools for this task. There are a few functions in OpenCV that look similar but don't fit my situation, for example (static) background subtraction and object identification.

The main obstacles are the rotating background and the unusual distortion (SuperView + stabilization).

In the end I decided to do this manually, which turned out to be super long and tedious.


Basically for each frame I need to mark the outline of the glider. There are 2200 frames, and in total I needed to add ~100k control points. I could do ~100 frames per workday, or 200-300 frames per weekend day, although this meant that I couldn't do anything else.

I decided to skip all the lines except for the brake line. I feel that the final result is acceptable. This actually saved lots of work.

It'd be more complicated if some objects should be rendered in front of the glider (e.g. particles). But I made the hoops large enough such that this situation should never happen.

4. Sound Effects

My friend Andrea kindly helped me with the voice acting. I chose a German voice because it carries extra information alongside the English subtitles, which should make the video less boring.

Thanks to the YouTube Audio Library, freesound.org and other websites, I was able to find many good BGM tracks and sound effects. The exception was the hoops, appearing and being smashed.

I learned about the tool LabChirp. It's a rather small piece of software and I didn't expect too much from it. Surprisingly it keeps a really good balance between ease of use and comprehensive features. With some luck, I was able to find parameters for the sounds in my head.


I'd give this video a rank of B+, the same ranking I gave the spiral manoeuvre shown in it. It's probably OK to show to my friends or to the public, but it is not the best I can do, and the long tedious labor is not good.

Throughout the video, I paid attention to the information density of each part: it should be neither too high nor too low, and it should not be constant.

Besides all the technical knowledge I learned while making this video, I probably also improved my time/project management skills.

Removed Features 
1. Before deciding to add the hoops, I spent some time marking the landing point and making an overlay showing its position and distance. This is similar to the "task icons" in many games. However it did not fit with the hoops.

2. There is a 3d map in my original telemetry overlay, which shows the estimated attitude of the glider as well as the 3d track. It actually looks very good with normal videos, but here it becomes a little distracting and repetitive alongside the hoops.

3. In early versions of the loading screen (before the spiral scene) I included the Swiss Segelflugkarte and a 3d map of the flying area. But in the end I think the loading screen looks better without them.

What can be improved
1. The editing process took far too long; there must be a way to shorten it.

2. If I had more time, I'd implement shadow casting between the glider and the hoops.

3. Roughly, I made the video in chronological order, but toward the end I felt tired and unmotivated due to the long tedious masking process. One direct consequence is that I lost all excitement while watching the spiral scene, because I had watched this part again and again, back and forth, hundreds of times. This is really bad for creativity. It would probably be better to roughly edit the whole video first, and then improve each section individually.





A while ago a colleague mentioned the music search feature of the Google App to me. I had known about the feature before, but had only ever triggered it by accident, never actually wanting to search for music. Later, at McDonald's, I liked the BGM and tried searching with the Google App. The ambient noise kept it from capturing the sound, but in the end, holding my phone high above my head like an idiot, I found the result on the first clean take: Lianne La Havas - Lost & Found
Speaking of which, I just remembered another track played at McDonald's; since it has lyrics, it didn't take long to find: Alex Winston - Sister Wife

A few days ago someone sent me a YouTube video, and I was very curious about its BGM; it was also easy to find: Dr Dre - The Next Episode. When I mentioned this music to her, I suddenly recalled an old memory of "a beautiful melody whose name I don't know". But I could no longer remember the melody itself, or where I had heard it. Of course I would recognize it immediately if I heard it again. I always feel that human memory is a one-way hash: for a four-line Tang poem, reciting the third line given the second is easy, but going backwards is much harder.

I decided to take up the challenge again, because I figured that if I could find this music once more, the Google App would most likely identify its name.


It's not the first time I've done this kind of exhaustive search. The FC game Moon Crystal from my childhood was a similar case: only in college did I find it, after going through every FC game screenshot on 梦幻岛. In primary or middle school I also once ran through every bookstore looking for a particular Chinese-class reference book. Apparently I'm more than a little obsessive.

Having ruled out Flash games, I thought of Windows apps, but searching my purchase history turned up nothing.

Then I thought of the iPad. I searched with keywords like difference, spot, find, and for each spot-the-difference game I saw, I downloaded the ipa and dug through it. The second game was the one I was looking for!

It turned out that in February 2012 I had downloaded a spot-the-difference game for the iPad called Find the Difference for iPad, and I really liked one of its BGM tracks. midomi/SoundHound returned only one result, which was not this track but a different one that merely had a few seconds of it spliced in. I dug the music file out of the IPA but found no useful information, and painfully had to let it go. Until earlier today, I had no memory left of the year, the app, or the melody.

Excited, I immediately searched with the Google App, but several attempts gave different results each time. About half were completely unrelated, while the other half were indeed this track, though with different lengths and names:

Im Bett Bleiben by Weckertöne (alarm tones)
Royalty Free Music Crew - I Don't Want to Leave You
Beardyman - Nothing To Undo (the one midomi found earlier)


Buddy (an iMovie track)
Bobby Cole - Martini Jazz

Bobby Cole also has a track called Cocktail Sipping Jazz, which sounds similar but is not the same, and is faster.

So the final conclusion is that this track is (originally) called Martini Jazz. I haven't found official sheet music, though there is a transcription of Buddy; I'll deal with that later. For now this is exciting enough.




My latest toy: https://github.com/coolwanglu/quine-chameleon





- Strings delimited by double quotes, with backslash escapes
- Splitting a string into an array by a separator
- Generating random numbers
- Command-line argument support
- Array lookup
- String escaping, or JSON output, or string/regex replacement
- Output (without a trailing newline)


While writing this toy I also naturally learned some new languages. I had studied Emacs Lisp earlier for dunnet.js, and here I added other Lisps such as Clojure and Racket, finding them quite handy for scripting. Unfortunately Scheme couldn't join, due to its tiny standard library and inconsistent SRFI implementations. I also finally touched Perl and awk, which I had never wanted to go near before; they turned out not only less scary than I imagined, but actually quite pleasant to use.





NetHack web port






NetHack also has various nice ports, such as Vulture for NetHack on Steam. Such a powerful game core, paired with attractive graphics and easy controls: simply brilliant!


Studying Metal Slug's Engine

There's a Debug Menu in many titles of the Metal Slug series. Today I played around with that debug menu in Metal Slug X; in particular I turned on 'body rect' and 'attack rect', which revealed how the game works. Not only did I have tons of hours of fun, I also learned a lot from this amazing game!

Of course the following are only my observations, or my best guesses at how the game has been implemented. Although I think my interpretation would work, it's definitely not the only way, and it's quite possible that the developers achieved the same effect with another (maybe better) method.

First Impression

At first glance, the debug drawing is far cleaner than I had expected; it doesn't seem to be tile-based physics at all. But note the thin shadowed area below the line, which may imply that it still uses some tile-based calculation.


The ground is described as line segments. From the way the shadow is drawn, I think those are directed edges, so we know which side things should stay on. Actually I think all the edges are half-platforms; more on that below.

In this picture, an obstacle is marked with 3 line segments, clever & cute!


Besides the edges, there are two types of rectangles, 'body rects' (marked with a pair of right angles) and 'attack rects' (marked with a pair of arrows).

All the rectangles are axis aligned, even when they disagree with the visuals. The missiles in the above pictures are heading in different directions, but their body rects are still AA. Sometimes the shape has to be approximated with multiple rectangles, as in the pictures below.

Two body rects for the truck

Note the attack rects of the missiles
Note the attack rects on the boss, marking the effective area of the spikes.

My interpretation of the rectangles is that they are used for hit detection. Attack rects mark the effective area of an attack, and body rects mark the effective area for receiving damage.
As for the attack rect of the player, I think it marks the melee range.
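Since all the rects are axis aligned, hit detection reduces to a plain AABB overlap test. A sketch of what that check might look like (my guess at the logic, not disassembled engine code):

```python
# Axis-aligned rectangle overlap, the cheapest possible hit test.
# Rects are (x_min, y_min, x_max, y_max). My guess at the check
# behind attack rect vs body rect; touching edges do not count.

def overlaps(a, b):
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

attack = (0, 0, 10, 10)
body = (5, 5, 15, 15)
print(overlaps(attack, body))            # interiors intersect: hit
print(overlaps(attack, (20, 20, 30, 30)))  # disjoint: no hit
```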


Objects in Metal Slug can be roughly categorized into the following groups:

- Terrain: ground, slopes, platforms
- Mobs: infantry, vehicles, bullets, projectiles
- Attack areas: logical areas that may cause damage

So let's go through all the possible kinds of collisions.

Terrain vs Mobs

Terrain collides with mobs only. Since terrain is mostly marked as line segments, for each mob it is easy to find out which point on the ground is supporting it. One way is to cast a ray in the gravity direction from right below the mob and find its first intersection with the terrain.
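That supporting-point query can be sketched as follows (my reconstruction, not actual engine code; the up/down side check implied by the directed edges is omitted for brevity):

```python
# Sketch of the "supporting point" query: from the mob's foot point,
# find the highest ground at or below it among the edges whose x
# range contains it. Edges are ((x1, y1), (x2, y2)), never vertical,
# matching the observation about the debug drawing. y points up.

def support_y(foot, edges):
    """foot: (x, y); returns the highest ground y at or below foot, or None."""
    fx, fy = foot
    best = None
    for (x1, y1), (x2, y2) in edges:
        if min(x1, x2) <= fx <= max(x1, x2):
            t = (fx - x1) / (x2 - x1)   # safe: no vertical edges, x2 != x1
            y = y1 + t * (y2 - y1)
            if y <= fy and (best is None or y > best):
                best = y
    return best

ground = [((0.0, 0.0), (10.0, 0.0)), ((10.0, 0.0), (20.0, 5.0))]
print(support_y((15.0, 10.0), ground))  # on the slope
```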

How to find the intersection then? I guess the game uses a left-to-right line sweep. A clue is that there are never vertical edges, which would have no upper or lower side. Whenever the game wants to set up vertical obstacles, unit squares are used, as below:

The stacked squares are used to block the player. I think it may be easier to implement this way than with vertical line segments. Also, I think the squares are not static body rects; why? See below.

Sometimes multiple squares are used, not sure why, maybe to avoid bugs? Also note that the body rect of the player is colliding with them, which proves that body rects are not used for terrain collision detection. Instead I think the foot point (I don't know the actual term; I just mean the mid-point between the feet) is used.


The entrance of a cave. The line segments with squares around them are reversed platforms: they prevent the player from going up but not from going down. This picture also proves that the foot point is used.

More examples:


A door.


There are still some questions left:
 - Are the line segments represented by two endpoints, or by tiles? Both are possible, but tiles may be more consistent with the squares.
 - Why use all those reversed platforms inside the entrances (the door and the cave)? I think the lowest one alone would be enough. One possible answer is robustness.

Finally, the joints:

When a slope intersects with horizontal edges, special care must be taken, as marked in the screenshot. Interesting.

Attack Areas vs Mobs

Attack areas only collide with mobs. It's pretty straightforward: if a mob collides with an attack rect, it may receive damage (more logic applies, e.g. friendly fire). There are also other cases, like saving hostages. It's too intuitive to spend more words on.

Mobs vs Mobs

In Metal Slug, usually the player doesn't collide with enemy infantry. But there are exceptions:

This is the rightmost position I can reach from the left side of mummies. Note that I can jump into them from above, but will get pushed away gently.

This is the leftmost position I can reach from the right side of zombie dogs. Note that in this case if I try to jump onto them, I will bounce up instead of fall through.

With tanks it seems more complicated; they behave differently when I approach from the left and from the right. I didn't figure out whether it's using the foot point or the body rect. There could be another type of rectangle, or an offset on the x axis.


So this is what I've got so far. I don't know why I hadn't thought of this idea before.

The use of directed edges and the foot point is quite effective, since it's point vs line-segment intersection, which is much faster than polygon intersection, or even AABB. Tunneling can be detected this way, and jump-through platforms are natural to implement.

On the other hand, this design works for Metal Slug but may not for other games, because the body rects may collide with the terrain, which would look weird in purely tile-based games like Super Mario. In Metal Slug, however, it's OK, since the background is not tiled and we can see the 'side faces' of '3d objects'.

Metal Slug has always been one of my top favorite games. Now there's a new way to play with more fun! I'd like to play through the series again with all the rectangles turned on!

Rocket lawn-cher!
Enemeee chaser!
Mission Complete!



surface pro stylus calibration

Search for 'UserLinearityData' in HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet and remove 'devicekind=pen'

merge tool: meld

The disk of my Linux VM was too small; after enlarging it I also adjusted the partitions, after which grub could no longer boot. On startup it said unknown filesystem and dropped into grub rescue.

In grub rescue, enter:

set prefix=(hd0,5)/boot/grub
insmod linux
linux (hd0,5)/boot/vmlinuz-3.11.0-15-generic
initrd (hd0,5)/boot/initrd.img-3.11.0-15-generic
boot




Converting sync C/C++ into async JavaScript

Emscripten is a C/C++ -> LLVM -> JavaScript compiler. It's useful and interesting, but one of its biggest limitations concerns the sync/async model: JavaScript is a single-threaded, event-driven language, therefore you cannot do a sleep() in the hope of receiving and processing extra events.


First of all, please enjoy this demo, on which I really spent some time.


  • Motivation: to make `sleep` and other similar functions actually work in our Runtime, without lots of labor work porting the C/C++ code
  • I actually made it work
  •  `sync->async` is doable and I think it's not too hard - there are already implementations for JS
  • I think this is a feature that cannot be supported by merely an external library
  • Does it fit emscripten? Or is it possible to write a plugin for emscripten?

Demo explained(?)

The demo is basically a stop-time animation:

//draw something
//wait for some while
//draw something else

You can verify this in the source code.

We all know that it is hard to convert this into the async model of JavaScript. Now please take a look at my ported code; it's almost identical to the original code, except for:
  • A few macros are introduced, which are defined in asyn2.h — actually the macros are pretty much just placeholders. 
  • A js function js_napms defined  — which is a wrapper of setTimeout

I fed emscripten with the code and blablabla — and the demo works. But wait! The code should be identical to the original code, which is synced!
Well, please let me explain a few more things before I reveal the secrets.

Another Demo

Here's another demo, which is... the same as above. So what's the deal?

We may imagine that, to really 'go to sleep', we need to store all the context and restore it when we come back again. Indeed, I did so in the source code: whenever you see an `ASYNC_` macro, it involves pushing and popping to maintain the async stack.

The actual functions behind those macros are defined in async.h.

Well, I'm NOT proposing a set of APIs or a library; instead I'm proposing a way of pre-processing the code, which I did manually myself. It's doable and there are patterns; you can see in the comments how a for-loop is broken down. I'll put technical details at the end.

The porting experience may not be as smooth as it looks. Actually `xmas` is rather straightforward, with hardly any nested for-loops or branches. But if you take a look at the other demos, it is a nightmare to define the callbacks and maintain the stack manually; just imagine that there are no `call` and `ret` instructions in ASM, and you have to `push`, `pop` and `jump` manually.

My point is: the sync->async transformation can, and should, be done by the pre-processor/compiler.

The Secrets of the 1st Demo

You didn't skip the previous section did you?

Actually I made the second demo first, before I knew the secret weapon, streamlinejs; here is an intuitive demo.

It's not a library, but a parser/compiler. I didn't go too deep into its mechanism, but from the results it generates, the mechanism should be similar to what I'll describe below. You may read this article for more details.

To build the first demo, all the placeholders are replaced with underscores, which are recognized by streamlinejs (as placeholders for callbacks). Fortunately, un-optimized JS generated by emscripten can be parsed without any problem; at least my demo can.

Technical stuff

Imagine a stack dedicated to async function calls. It is different from a traditional stack in that it is not unwound when a function exits.

Async function calls are different from (normal) sync function calls: an async call pushes the context onto the async stack, including the callback (similar to the return address in the sync case), and returns. The central event dispatcher (the JS engine in our case) will call the callback eventually.
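As a toy model of that idea (Python rather than C, purely illustrative): an explicit stack of continuations, plus a trivial dispatcher standing in for the JS event loop.

```python
# Toy model of the async stack. The caller pushes its continuation
# (the code after the sleep) before making the async call; a trivial
# dispatcher plays the role of the JS event loop and resumes it.

stack = []    # the async stack: pending continuations
pending = []  # "timers" waiting to fire

def async_sleep(ms):
    pending.append(ms)   # pretend to schedule a timeout and return

def dispatcher(log):
    while pending:
        pending.pop(0)   # a timeout fires...
        stack.pop()(log) # ...resume the saved continuation

def work(log):
    j = 99               # local state survives in the closure
    def cont(log):       # continuation = code after the sleep
        log.append("result %d" % j)
    stack.append(cont)
    async_sleep(1000)    # returns immediately, like in JS

log = []
work(log)
dispatcher(log)
print(log)
```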

So the central idea is to identify all the async function calls, which usually arise for two reasons:

  • Calling an async function
  • `jump` over an async call

The first one should be easy: some functions are natively async, e.g. `SDL_Delay`, and if a function calls any other async function inside, it is async.

The second one usually originates from loops and branches, which will be explained later.

I think these can be identified by the compiler in one of the following stages:

- Pre-processing C/C++ — I did that manually myself
- LLVM bitcode — which I'm not so sure about
- JavaScript — streamline itself is an example

There are advantages and disadvantages at each stage; for example, it might be easier to optimize the code when parsing the C source, while storing the local variables in JavaScript closures may be more lightweight.

Identifying and transforming async functions

Here's an example:

// sync version
void work()
{
    int j = 99;
    SDL_Delay(1000);
    printf("result %d\n", j);
}

Since SDL_Delay is natively async, we have to transform `work` into its async counterpart, as follows:

// async version
// context: stack for async calls
int work(context_t * context)
{
    int j = 99;

    push_callback(context, work__cb1); // set up the callback
    put_variable(context, j);          // save local variables

    SDL_Delay(1000, context); // async version of SDL_Delay
    return 0;                 // get out and wait for SDL_Delay
}

int work__cb1(context_t * context)
{
    int j;
    get_variable(context, j);
    pop(context); // prepare to return to the previously chained callback
    printf("result %d\n", j);
    return 0;
}

For-loops make the situation more complicated and cause the second type of async call:

int f()
{
    for(int i = 0; i < 10; ++i) {
        printf("hi ");
        SDL_Delay(10);
        printf("%d\n", i);
    }
    return 0;
}

f() can be flattened as

int f()
{
    int i = 0;
start:
    if(i >= 10) goto end;
    printf("hi ");
    SDL_Delay(10);
    printf("%d\n", i);
    ++i;
    goto start;
end:
    // nothing
    return 0;
}

Now it is clear that we can split the function and make async calls

int f(context_t * context)
{
    int i = 0;
    // save i to the stack
    // async call f_start()
    return 0;
}

int f_start(context_t * context)
{
    // restore i
    // pop stack

    if(i >= 10) {
        // async call f_end()
        return 0;
    }

    printf("hi ");

    // save i
    // push f_start2 onto the stack
    SDL_Delay(10, context);
    return 0;
}

int f_start2(context_t * context)
{
    // restore i
    // pop stack

    printf("%d\n", i);

    // push i
    // async call f_start() to continue the loop
    return 0;
}

int f_end(context_t * context)
{
    // pop stack
    // async call the callback of f()
    return 0;
}

Branches (if, switch, etc.) are similar, as long as we treat them as `goto`s.

Local variables and return values

Local variables may be stored and retrieved when we push/pop the async stack,
and so may return values.

Compiler/Preprocessor Integration: Step 1

It should be clear now that this feature is a kind of transformation, which cannot be supported by linking to an external library. Of course the pre-condition is that the transformation should be (almost) transparent: developers should not need to maintain the stack manually.

In the first step, I'd imagine, the async functions are explicitly marked through some mechanism. In my example, a placeholder is used.

Developers may still write programs in the sync fashion, for two reasons: convenience when writing new programs, and ease of porting existing ones.

The compiler should detect, split and set up async functions automatically; the async stack should be managed by the standard library, while some APIs might be exposed.

There are two ways  of managing the local variables, let me call them the C style and the JavaScript style:

The C style: local variables of async functions are stored in a dedicated area of memory (the heap, or a special stack for async functions) instead of the normal stack. To avoid lots of memcpy's, the variables may be directly allocated there. Some push/pop operations may be optimized away when the caller/callee is known (e.g. loops/branches).

The JavaScript style: streamlinejs is a good example. Async functions are broken into a series of recursive functions, and local variables are stored in the closures.

The JavaScript style is easy and intuitive, but the hidden overhead might not be negligible, and it may be too late to optimize once the LLVM bitcode has been transformed into JavaScript.

Compiler/Preprocessor Integration: Step 2

It might be possible to further reduce the work of writing/porting, as even marking the async functions and defining the placeholders for every async function declaration and call is boring and error-prone.

My (naive & wild) imagination: by defining a few essential async functions (such as SDL_Delay), the compiler would automatically recognize async functions and set up the hidden parameter. It's not perfect, especially when we need to link a number of libraries, but at least I think a C/C++ transformer would be possible and nice, perhaps based on LLVM?


  • It might not work for multi-threading. Indeed I've only been thinking about single-threaded programs, especially terminal ones — but this should not affect the importance of this issue, I think.
  • Lots of overhead might be introduced this way — but I guess the performance would not suffer much if well optimized.
  • C++: ctor/copy/dtor of objects might be a problem, or maybe not, since they can be `flattened` into C style?
  • C++: try-catch might not work, as the control flow is already diverted.
  • There are a few limitations of streamlinejs, but I think many of them can be addressed if we process in the C phase.


Please follow this thread on GitHub.