<p>~rom1v/blog (https://blog.rom1v.com/), Romain Vimont (rom@rom1v.com), updated 2023-03-14T16:40:05+01:00</p>
<h1>Scrcpy 2.0, with audio</h1>
<p>2023-03-12T02:30:00+01:00, https://blog.rom1v.com/2023/03/scrcpy-2-0-with-audio</p>
<p>I am thrilled to announce the release of <a href="https://github.com/Genymobile/scrcpy">scrcpy 2.0</a>. Now, you can
mirror (and record) your Android 11+ devices in real-time with audio forwarded
to your computer!</p>
<p>This new version also includes an option to select the video and audio codecs.
The device screen can now be encoded in H.265, or even AV1 if your device
supports AV1 encoding (though this is unlikely).</p>
<p>The application is free and open source. Follow the <a href="https://github.com/Genymobile/scrcpy/blob/master/README.md#get-the-app">instructions</a>
to install it and run it on your computer.</p>
<p><em>If you like scrcpy, you can <a href="/about/#support-my-open-source-work">support my open source work</a>.</em></p>
<p class="center"><a href="https://github.com/Genymobile/scrcpy"><img src="/assets/scrcpy2/icon.png" alt="scrcpy" /></a></p>
<h2 id="audio-usage">Audio usage</h2>
<p>Audio forwarding is supported for devices with Android 11 or higher, and it is
enabled by default:</p>
<ul>
<li>For <strong>Android 12 or newer</strong>, it works out-of-the-box.</li>
<li>For <strong>Android 11</strong>, you’ll need to ensure that the device screen is unlocked
when starting scrcpy. A fake popup will briefly appear to make the system
think that the shell app is in the foreground. Without this, audio capture
will fail.</li>
<li>For <strong>Android 10 or earlier</strong>, audio cannot be captured and is automatically
disabled.</li>
</ul>
<p>You can disable audio with:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>scrcpy --no-audio
</code></pre></div></div>
<p>If audio is enabled, it is also recorded:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>scrcpy --record=file.mkv
</code></pre></div></div>
<p>Unlike video, audio requires some buffering even in real-time. The buffer size
needs to be small enough to maintain acceptable latency, but large enough to
minimize buffer underrun, which causes audio glitches. The default buffer
size is set to 50ms, but it can be adjusted:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>scrcpy --audio-buffer=30
</code></pre></div></div>
<p>To improve playback smoothness, you may deliberately increase the latency:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>scrcpy --audio-buffer=200
</code></pre></div></div>
<p>This is useful, for example, to project your personal videos on a bigger screen:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>scrcpy --video-codec=h265 --display-buffer=200 --audio-buffer=200
</code></pre></div></div>
<p>You can also select the audio codec and bit rate (default is <a href="https://en.wikipedia.org/wiki/Opus_(audio_format)">Opus</a> at 128 kbps).
As a side note, I’m particularly impressed by the Opus codec at very low
bit rate:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>scrcpy <span class="nt">--audio-codec</span><span class="o">=</span>opus <span class="nt">--audio-bit-rate</span><span class="o">=</span>16k
scrcpy <span class="nt">--audio-codec</span><span class="o">=</span>aac <span class="nt">--audio-bit-rate</span><span class="o">=</span>16k
</code></pre></div></div>
<p>See the <a href="https://github.com/Genymobile/scrcpy/blob/master/doc/audio.md">audio documentation</a> page for more details.</p>
<h2 id="history">History</h2>
<p>The first version of scrcpy was released <a href="/2018/03/introducing-scrcpy/">5 years ago</a>. Since then,
audio forwarding has been one of the most requested features (see <a href="https://github.com/Genymobile/scrcpy/issues/14">issue #14</a>).</p>
<p>I first experimented with this and developed <a href="/2019/06/introducing-usbaudio/">USBaudio</a> as a
solution, but it worked poorly and the feature it relied on was deprecated in
Android 8.</p>
<p>With the introduction of a new API to capture audio from an Android app in
Android 10, I made a prototype called <a href="/2020/06/audio-forwarding-on-android-10/">sndcpy</a>. However, there were
several issues. Firstly, it had to be invoked from an Android app (the
scrcpy server is not an Android app, but <a href="/2018/03/introducing-scrcpy/#run-a-java-main-on-android">a Java executable run with shell
permissions</a>). Most importantly, this API <a href="https://developer.android.com/guide/topics/media/av-capture#capture_policy">lets apps
decide</a> whether they can be captured or not, meaning many apps
simply could not be captured, causing confusion for users.</p>
<p>At the end of January, <a href="https://github.com/yume-chan"><strong>@yume-chan</strong></a> (a <em>scrcpy</em> user) <a href="https://github.com/Genymobile/scrcpy/pull/3703">provided
a proof-of-concept</a> to capture the device audio with <em>shell</em> permissions
and also proposed a workaround for Android 11.</p>
<p>Since then, I <a href="https://github.com/Genymobile/scrcpy/pull/3757">have been working</a> on a proper integration into scrcpy
<em>(my evenings and weekends have been quite busy</em> 🙂<em>)</em>. I added encoding,
recording, buffering and playback with clock drift compensation to prevent audio
delay from drifting.</p>
<p>Below are more technical details.</p>
<h2 id="audio-capture">Audio capture</h2>
<p>On the device, audio is captured by an <a href="https://developer.android.com/reference/android/media/AudioRecord"><code class="language-plaintext highlighter-rouge">AudioRecord</code></a> with <a href="https://developer.android.com/reference/android/media/MediaRecorder.AudioSource#REMOTE_SUBMIX"><code class="language-plaintext highlighter-rouge">REMOTE_SUBMIX</code></a> as
the audio source.</p>
<p>The API is straightforward to use, but not very low-latency friendly. It is
possible to <a href="https://developer.android.com/reference/android/media/AudioRecord#read(java.nio.ByteBuffer,%20int,%20int)">read</a> a number of requested bytes in one of two
modes:</p>
<ul>
<li><a href="https://developer.android.com/reference/android/media/AudioRecord#READ_BLOCKING"><code class="language-plaintext highlighter-rouge">READ_BLOCKING</code></a>: the read will block until <strong>all</strong> the requested data is
read (it should be called <code class="language-plaintext highlighter-rouge">READ_ALL_BLOCKING</code>).</li>
<li><a href="https://developer.android.com/reference/android/media/AudioRecord#READ_NON_BLOCKING"><code class="language-plaintext highlighter-rouge">READ_NON_BLOCKING</code></a>: the read will return immediately after reading as
much audio data as possible without blocking.</li>
</ul>
<p>However, the most useful mode, which is a blocking read that may return less
data than requested (like the <a href="https://man7.org/linux/man-pages/man2/read.2.html"><code class="language-plaintext highlighter-rouge">read()</code></a> system call), is missing.</p>
<p>Since the amount of data available is unknown beforehand, in <code class="language-plaintext highlighter-rouge">READ_BLOCKING</code>
mode, scrcpy might wait for too long. Conversely, in <code class="language-plaintext highlighter-rouge">READ_NON_BLOCKING</code> mode,
scrcpy would read in a busy loop, burning CPU while the function returns 0 most
of the time.</p>
<p>I decided to use <code class="language-plaintext highlighter-rouge">READ_BLOCKING</code> with a size of 5ms (960 bytes).</p>
<p>In practice, on the devices I tested, audio blocks are produced only
every 20ms, introducing a latency of 20ms. This is not a limiting factor though,
since the default Opus and AAC encoder implementations on Android produce frames
of 960 samples (20ms) and 1024 samples (21.33ms) respectively (and the frame
size is not configurable).</p>
<p>In these conditions, <em>scrcpy</em> successively reads 4 blocks of 5ms every 20ms.
The read size could be increased to 20ms (3840 bytes), but in theory some
devices might produce audio blocks more frequently, so the smaller size is kept.</p>
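<p>The byte counts above can be checked with a little arithmetic. The following is a minimal sketch, assuming 16-bit stereo PCM at 48kHz (an assumption inferred from the figures in the text: 960 bytes for 5ms, and 960 samples for 20ms); the class and method names are hypothetical.</p>

```java
// Sketch: convert between durations, byte counts and sample counts.
// The format constants are assumptions inferred from the figures in the text.
final class CaptureSize {
    static final int SAMPLE_RATE = 48000;   // frames per second (assumed)
    static final int CHANNELS = 2;          // stereo (assumed)
    static final int BYTES_PER_SAMPLE = 2;  // 16-bit PCM (assumed)

    // Number of bytes needed to hold `millis` milliseconds of audio.
    static int bytesForMillis(int millis) {
        return SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE * millis / 1000;
    }

    // Duration in milliseconds of an encoder frame of `samples` samples per channel.
    static double millisForSamples(int samples) {
        return samples * 1000.0 / SAMPLE_RATE;
    }

    public static void main(String[] args) {
        System.out.println(bytesForMillis(5));      // 960
        System.out.println(bytesForMillis(20));     // 3840
        System.out.println(millisForSamples(960));  // 20.0
        System.out.println(millisForSamples(1024)); // about 21.33
    }
}
```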
<p>With the missing blocking mode (<code class="language-plaintext highlighter-rouge">READ_BLOCKING_THE_REAL_ONE</code>), it would be
possible to request a read with a larger buffer (e.g. 500ms) in one call, and
the <code class="language-plaintext highlighter-rouge">AudioRecord</code> would return as much data as possible whenever it is
available.</p>
<h2 id="audio-encoding">Audio encoding</h2>
<p>The captured audio samples are then encoded by <a href="https://developer.android.com/reference/android/media/MediaCodec"><code class="language-plaintext highlighter-rouge">MediaCodec</code></a>, which offers both
<a href="https://developer.android.com/reference/android/media/MediaCodec#synchronous-processing-using-buffers">synchronous</a> and <a href="https://developer.android.com/reference/android/media/MediaCodec#asynchronous-processing-using-buffers">asynchronous</a> APIs.</p>
<p>For our purpose, we need to execute two actions in parallel:</p>
<ul>
<li>sending input audio buffers (read by our <code class="language-plaintext highlighter-rouge">AudioRecord</code>);</li>
<li>receiving output audio buffers (the encoded packets).</li>
</ul>
<p>Therefore, the asynchronous API is more suitable than the synchronous one.</p>
<p>Here is how it is <a href="https://developer.android.com/reference/android/media/MediaCodec#asynchronous-processing-using-buffers">documented</a>:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nc">MediaCodec</span> <span class="n">codec</span> <span class="o">=</span> <span class="nc">MediaCodec</span><span class="o">.</span><span class="na">createByCodecName</span><span class="o">(</span><span class="n">name</span><span class="o">);</span>
<span class="n">codec</span><span class="o">.</span><span class="na">setCallback</span><span class="o">(</span><span class="k">new</span> <span class="nc">MediaCodec</span><span class="o">.</span><span class="na">Callback</span><span class="o">()</span> <span class="o">{</span>
    <span class="nd">@Override</span>
    <span class="kt">void</span> <span class="nf">onInputBufferAvailable</span><span class="o">(</span><span class="nc">MediaCodec</span> <span class="n">mc</span><span class="o">,</span> <span class="kt">int</span> <span class="n">inputBufferId</span><span class="o">)</span> <span class="o">{</span>
        <span class="nc">ByteBuffer</span> <span class="n">inputBuffer</span> <span class="o">=</span> <span class="n">codec</span><span class="o">.</span><span class="na">getInputBuffer</span><span class="o">(</span><span class="n">inputBufferId</span><span class="o">);</span>
        <span class="c1">// fill inputBuffer with valid data</span>
        <span class="err">…</span>
        <span class="n">codec</span><span class="o">.</span><span class="na">queueInputBuffer</span><span class="o">(</span><span class="n">inputBufferId</span><span class="o">,</span> <span class="err">…</span><span class="o">);</span>
    <span class="o">}</span>
    <span class="nd">@Override</span>
    <span class="kt">void</span> <span class="nf">onOutputBufferAvailable</span><span class="o">(</span><span class="nc">MediaCodec</span> <span class="n">mc</span><span class="o">,</span> <span class="kt">int</span> <span class="n">outputBufferId</span><span class="o">,</span> <span class="err">…</span><span class="o">)</span> <span class="o">{</span>
        <span class="nc">ByteBuffer</span> <span class="n">outputBuffer</span> <span class="o">=</span> <span class="n">codec</span><span class="o">.</span><span class="na">getOutputBuffer</span><span class="o">(</span><span class="n">outputBufferId</span><span class="o">);</span>
        <span class="c1">// outputBuffer is ready to be processed or rendered.</span>
        <span class="err">…</span>
        <span class="n">codec</span><span class="o">.</span><span class="na">releaseOutputBuffer</span><span class="o">(</span><span class="n">outputBufferId</span><span class="o">,</span> <span class="err">…</span><span class="o">);</span>
    <span class="o">}</span>
    <span class="err">…</span>
<span class="o">});</span>
</code></pre></div></div>
<p>However, there is a catch: the callbacks (<code class="language-plaintext highlighter-rouge">onInputBufferAvailable()</code> and
<code class="language-plaintext highlighter-rouge">onOutputBufferAvailable()</code>) are called from the same thread and cannot run in
parallel.</p>
<p>Filling an input buffer requires a blocking call to read from the <code class="language-plaintext highlighter-rouge">AudioRecord</code>,
while processing the output buffers involves a blocking call to send the data to
the client over a socket.</p>
<p>If we were to process the buffers directly from the callbacks, the processing
of an output buffer would be delayed until the blocking call to
<code class="language-plaintext highlighter-rouge">AudioRecord.read()</code> completes (which may be up to 20ms as described in the
previous section). This would result in additional latency.</p>
<p>To address this issue, the callback only submits tasks to input and output
queues, which are processed by dedicated threads:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// simplified</span>
<span class="n">codec</span><span class="o">.</span><span class="na">setCallback</span><span class="o">(</span><span class="k">new</span> <span class="nc">MediaCodec</span><span class="o">.</span><span class="na">Callback</span><span class="o">()</span> <span class="o">{</span>
    <span class="nd">@Override</span>
    <span class="kt">void</span> <span class="nf">onInputBufferAvailable</span><span class="o">(</span><span class="nc">MediaCodec</span> <span class="n">mc</span><span class="o">,</span> <span class="kt">int</span> <span class="n">inputBufferId</span><span class="o">)</span> <span class="o">{</span>
        <span class="c1">// only enqueue: the blocking read happens on the input thread</span>
        <span class="n">inputTasks</span><span class="o">.</span><span class="na">put</span><span class="o">(</span><span class="k">new</span> <span class="nc">InputTask</span><span class="o">(</span><span class="n">inputBufferId</span><span class="o">));</span>
    <span class="o">}</span>
    <span class="nd">@Override</span>
    <span class="kt">void</span> <span class="nf">onOutputBufferAvailable</span><span class="o">(</span><span class="nc">MediaCodec</span> <span class="n">mc</span><span class="o">,</span> <span class="kt">int</span> <span class="n">outputBufferId</span><span class="o">,</span>
                                 <span class="nc">MediaCodec</span><span class="o">.</span><span class="na">BufferInfo</span> <span class="n">bufferInfo</span><span class="o">)</span> <span class="o">{</span>
        <span class="c1">// only enqueue: the blocking socket write happens on the output thread</span>
        <span class="n">outputTasks</span><span class="o">.</span><span class="na">put</span><span class="o">(</span><span class="k">new</span> <span class="nc">OutputTask</span><span class="o">(</span><span class="n">outputBufferId</span><span class="o">,</span> <span class="n">bufferInfo</span><span class="o">));</span>
    <span class="o">}</span>
    <span class="err">…</span>
<span class="o">});</span>
</code></pre></div></div>
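<p>The pattern of a callback that only enqueues, with a dedicated thread doing the blocking work, can be sketched without any Android dependency. This is a minimal illustration, not scrcpy's actual code: <code class="language-plaintext highlighter-rouge">TaskPump</code>, <code class="language-plaintext highlighter-rouge">Task</code> and <code class="language-plaintext highlighter-rouge">onBufferAvailable()</code> are hypothetical names, and only one queue is shown where the text above uses two (input and output).</p>

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch: the callback thread only enqueues tasks; a worker thread processes them.
final class TaskPump {
    // Hypothetical task type: it just carries a codec buffer index.
    static final class Task {
        final int bufferId;
        Task(int bufferId) { this.bufferId = bufferId; }
    }

    private static final Task STOP = new Task(-1); // sentinel ending the worker

    private final BlockingQueue<Task> tasks = new LinkedBlockingQueue<>();
    private final List<Integer> processed = Collections.synchronizedList(new ArrayList<>());
    private final Thread worker = new Thread(this::run, "task-worker");

    void start() { worker.start(); }

    // Called from the single callback thread: never blocks on I/O.
    void onBufferAvailable(int bufferId) {
        tasks.add(new Task(bufferId));
    }

    // Let the worker drain the queued tasks, wait for it, and return what it processed.
    List<Integer> stopAndJoin() {
        tasks.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the worker", e);
        }
        return new ArrayList<>(processed);
    }

    private void run() {
        try {
            while (true) {
                Task task = tasks.take(); // blocks until a task is available
                if (task == STOP) {
                    return;
                }
                // A real worker would block here (AudioRecord.read() or a socket write).
                processed.add(task.bufferId);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

<p>Because the queue is FIFO and a single worker consumes it, buffers are handled in the order in which the callbacks were invoked.</p>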
<h2 id="client-architecture">Client architecture</h2>
<p>Here is an overview of the client architecture for the video and audio streams:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                                                 V4L2 sink
                                               /
                                       decoder
                                     /         \
        VIDEO -------------> demuxer             display
                                     \
                                       recorder
                                     /
        AUDIO -------------> demuxer
                                     \
                                       decoder --- audio player
</code></pre></div></div>
<p>The video and audio are captured and encoded on the device, and the resulting
packets are sent via separate sockets over an adb tunnel using a custom
protocol. This protocol transmits the raw encoded packets with packet headers
that provide early information about packet boundaries (useful to <a href="https://github.com/Genymobile/scrcpy/pull/646">reduce video
latency</a>) and <a href="https://en.wikipedia.org/wiki/Presentation_timestamp">PTS</a> (used for recording).</p>
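<p>To illustrate the idea of a packet header carrying the PTS and the packet boundary, here is a minimal sketch. The layout (an 8-byte big-endian PTS followed by a 4-byte payload length) is an assumption made for this example; the text does not describe the actual wire format, which may differ. <code class="language-plaintext highlighter-rouge">PacketHeader</code> is a hypothetical name.</p>

```java
import java.nio.ByteBuffer;

// Sketch of a fixed-size packet header (hypothetical layout, see the lead-in).
final class PacketHeader {
    static final int HEADER_SIZE = 12; // assumed: 8-byte PTS + 4-byte payload length

    final long pts;          // presentation timestamp
    final int payloadLength; // size of the encoded packet that follows the header

    PacketHeader(long pts, int payloadLength) {
        this.pts = pts;
        this.payloadLength = payloadLength;
    }

    // Serialize in big-endian order (the default byte order of ByteBuffer).
    byte[] toBytes() {
        return ByteBuffer.allocate(HEADER_SIZE).putLong(pts).putInt(payloadLength).array();
    }

    // Parse a header from the first HEADER_SIZE bytes of `data`.
    static PacketHeader parse(byte[] data) {
        ByteBuffer buffer = ByteBuffer.wrap(data, 0, HEADER_SIZE);
        long pts = buffer.getLong();
        int payloadLength = buffer.getInt();
        return new PacketHeader(pts, payloadLength);
    }
}
```

<p>Knowing the payload length up front lets the receiver read exactly one packet from the stream, without waiting for the start of the next one to detect the boundary.</p>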
<p>Video and audio streams are then <em>demuxed</em> into <a href="https://ffmpeg.org/doxygen/6.0/structAVPacket.html">packets</a> by a
<code class="language-plaintext highlighter-rouge">demuxer</code>.</p>
<p>If <a href="https://github.com/Genymobile/scrcpy/blob/master/doc/recording.md">recording</a> is enabled, the <code class="language-plaintext highlighter-rouge">recorder</code> asynchronously <em>muxes</em> the elementary
streams into MP4 or MKV. Thus, the packets are encoded on the device side, but
muxed on the client side (it’s the division of labour!).</p>
<p>If a <a href="https://github.com/Genymobile/scrcpy/blob/master/doc/video.md#no-display">display</a> or <a href="https://github.com/Genymobile/scrcpy/blob/master/doc/v4l2.md">V4L2</a> is enabled, then the video <em>packets</em> must be decoded by
a <code class="language-plaintext highlighter-rouge">decoder</code> into video <a href="https://ffmpeg.org/doxygen/6.0/structAVFrame.html">frames</a> to be displayed or sent to V4L2.</p>
<p>If <a href="https://github.com/Genymobile/scrcpy/blob/master/doc/audio.md">audio</a> playback is enabled (currently when a display is enabled), the
audio packets are decoded into audio frames (blocks of samples) and played by
the audio player.</p>
<h2 id="audio-player">Audio player</h2>
<p>This is the last component I implemented (I wrote recording before playback),
because it is the trickiest, especially to compensate for the following:</p>
<ul>
<li><strong>clock offset</strong>: the audio output might not start precisely when necessary to
play the samples at the right time;</li>
<li><strong>clock drift</strong>: the device clock and the client clock may not advance
at precisely the same rate;</li>
<li><strong>buffer underrun</strong>: when the player has no samples available when requested
by the audio output.</li>
</ul>
<p>While scrcpy displays the latest received video frame without buffering, this
isn’t possible for audio. Playing the latest received audio sample would be
meaningless.</p>
<p>As input, the player regularly receives <a href="https://ffmpeg.org/doxygen/6.0/structAVFrame.html"><code class="language-plaintext highlighter-rouge">AVFrame</code></a>s of decoded audio
samples. As output, a callback regularly requests audio samples to be played. In
between, an audio buffer stores produced samples that have yet to be consumed.</p>
<p>The player aims to feed the audio output with as little latency as possible
while avoiding buffer underrun. To achieve this, it attempts to maintain the
average buffering (the number of samples present in the buffer) around a target
value. If this target buffering is too low, then buffer underrun will occur
frequently. If it is too high, then latency becomes unacceptable. This target
value is configured using the scrcpy option
<a href="https://github.com/Genymobile/scrcpy/blob/master/doc/audio.md#buffering"><code class="language-plaintext highlighter-rouge">--audio-buffer</code></a>.</p>
<p>The playback relies only on buffer filling; the <a href="https://en.wikipedia.org/wiki/Presentation_timestamp">PTS</a> are not used at all by the
audio player (just as they are not used for video mirroring, unless <a href="https://github.com/Genymobile/scrcpy/pull/2464">video
buffering</a> is enabled). PTS are only used for recording.</p>
<p>The player cannot adjust the sample input rate (it receives samples produced
in real-time) or the sample output rate (it must provide samples as requested
by the audio output callback). Therefore, it may only apply compensation by
resampling (converting <em>m</em> input samples to <em>n</em> output samples).</p>
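<p>To make “converting <em>m</em> input samples to <em>n</em> output samples” concrete, here is a minimal sketch using plain linear interpolation between the first and last input samples. It is only an illustration of the idea, not the filter a real resampling library would use; <code class="language-plaintext highlighter-rouge">LinearResampler</code> is a hypothetical name.</p>

```java
import java.util.Arrays;

// Sketch: stretch or shrink a block of samples by linear interpolation.
final class LinearResampler {
    // Convert input.length samples into outputLength samples.
    static float[] resample(float[] input, int outputLength) {
        float[] output = new float[outputLength];
        if (input.length == 0 || outputLength == 0) {
            return output; // nothing to interpolate
        }
        if (input.length == 1 || outputLength == 1) {
            Arrays.fill(output, input[0]);
            return output;
        }
        // Map output index i to a fractional position in the input block.
        double step = (double) (input.length - 1) / (outputLength - 1);
        for (int i = 0; i < outputLength; i++) {
            double position = i * step;
            int left = (int) position;
            int right = Math.min(left + 1, input.length - 1);
            double fraction = position - left;
            output[i] = (float) (input[left] * (1.0 - fraction) + input[right] * fraction);
        }
        return output;
    }
}
```

<p>Producing slightly more output samples than input samples fills the buffer, and producing slightly fewer drains it, which is how resampling can steer the buffering level.</p>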
<p>The compensation itself is applied by <a href="https://ffmpeg.org/doxygen/6.0/group__lswr.html#details">swresample</a> (FFmpeg). It is configured
using <a href="https://ffmpeg.org/doxygen/6.0/group__lswr.html#gab7f21690522b85d7757e13fa9853d4d8"><code class="language-plaintext highlighter-rouge">swr_set_compensation()</code></a>. An important task of the player is to
estimate the compensation value regularly and apply it.</p>
<p>The estimated buffering level combines two things: an average of the “natural”
buffering (samples are produced and consumed in blocks, so it must be smoothed),
and instant adjustments resulting from the player’s own actions (explicit
compensation and silence insertion on underflow), which are not smoothed.</p>
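<p>A minimal sketch of such an estimate is shown below. It uses an exponential moving average for the smoothed part, which is a technique swapped in for this example (the text does not say which averaging scrcpy uses); <code class="language-plaintext highlighter-rouge">BufferingEstimate</code> and its smoothing weight are hypothetical.</p>

```java
// Sketch: smoothed buffering estimate with unsmoothed instant adjustments.
final class BufferingEstimate {
    private final double weight; // smoothing factor in (0, 1], hypothetical
    private double average;
    private boolean initialized;

    BufferingEstimate(double weight) {
        this.weight = weight;
    }

    // Smooth a "natural" measurement: the number of samples currently buffered.
    void push(int measured) {
        if (!initialized) {
            average = measured;
            initialized = true;
        } else {
            average += weight * (measured - average);
        }
    }

    // Apply the effect of the player's own action (for example inserted silence)
    // immediately, without smoothing.
    void adjust(int samples) {
        average += samples;
    }

    double get() {
        return average;
    }
}
```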
<p>Buffer underflow events can occur when packets arrive too late. In that case,
the player inserts silence. Once the packets finally arrive (late), one strategy
could be to drop the samples that were replaced by silence, in order to keep a
minimal latency. However, dropping samples in case of buffer underflow is
inadvisable, as it would temporarily increase the underflow even more and cause
very noticeable audio glitches.</p>
<p>Therefore, the player doesn’t drop any sample on underflow. The compensation
mechanism will absorb the delay introduced by the inserted silence.</p>
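<p>The “insert silence, but never drop” behaviour can be sketched with a simple queue of samples. This is a minimal illustration under simplifying assumptions (mono 16-bit samples, no thread safety); <code class="language-plaintext highlighter-rouge">SampleBuffer</code> is a hypothetical name.</p>

```java
import java.util.ArrayDeque;

// Sketch: on underflow, pad the output with silence and keep late samples.
final class SampleBuffer {
    private final ArrayDeque<Short> samples = new ArrayDeque<>();
    private long silenceInserted; // total number of zero samples inserted so far

    // Called when decoded samples arrive (possibly late): nothing is ever dropped.
    void write(short[] input) {
        for (short sample : input) {
            samples.addLast(sample);
        }
    }

    // Called by the audio output: always returns exactly `count` samples.
    short[] read(int count) {
        short[] output = new short[count]; // zero-initialized, i.e. already silence
        int available = Math.min(count, samples.size());
        for (int i = 0; i < available; i++) {
            output[i] = samples.pollFirst();
        }
        silenceInserted += count - available; // the missing part stays silent
        return output;
    }

    int buffered() {
        return samples.size();
    }

    long silenceInserted() {
        return silenceInserted;
    }
}
```

<p>After an underflow, the late samples increase the buffering level above its target; the inserted silence count is the kind of value an estimate would account for so that compensation can bring the level back down.</p>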
<h2 id="conclusion">Conclusion</h2>
<p>I’m delighted that scrcpy now supports audio forwarding after much effort.</p>
<p>While I expect that the audio player will require some fine-tuning in the future
to better handle edge cases, it currently performs quite well.</p>
<p>I would like to extend a huge thank you to <strong>@yume-chan</strong> for his initial
proof-of-concept, which made this feature possible.</p>
<p>Happy mirroring!</p>
<h1>Audio forwarding on Android 10</h1>
<p>2020-06-09T21:50:00+02:00, https://blog.rom1v.com/2020/06/audio-forwarding-on-android-10</p>
<p>Audio forwarding is one of the most requested features in <a href="/2018/03/introducing-scrcpy/">scrcpy</a> (see <a href="https://github.com/Genymobile/scrcpy/issues/14">issue
#14</a>).</p>
<p>Last year, I published a small experiment (<a href="/2019/06/introducing-usbaudio/">USBaudio</a>) to forward audio over
USB, using the <a href="https://source.android.com/devices/accessories/aoa2">AOA2</a> protocol. Unfortunately, this Android feature was
unreliable, and has been deprecated in Android 8.</p>
<p>Here is a new tool I developed to play the device audio output on the computer,
using the <a href="https://developer.android.com/guide/topics/media/playback-capture">Playback Capture API</a> introduced in Android 10: <a href="https://github.com/rom1v/sndcpy"><code class="language-plaintext highlighter-rouge">sndcpy</code></a>.</p>
<p><em>The name follows the same logic:</em></p>
<ul>
<li><em><code class="language-plaintext highlighter-rouge">strcpy</code>: <strong>str</strong>ing <strong>c</strong>o<strong>py</strong></em></li>
<li><em><code class="language-plaintext highlighter-rouge">scrcpy</code>: <strong>scr</strong>een <strong>c</strong>o<strong>py</strong></em></li>
<li><em><code class="language-plaintext highlighter-rouge">sndcpy</code>: <strong>s</strong>ou<strong>nd</strong> <strong>c</strong>o<strong>py</strong></em></li>
</ul>
<p>This is a quick proof-of-concept, composed of:</p>
<ul>
<li>an Android application, which captures and streams the device audio over a
socket;</li>
<li>a shell script, which starts the app and runs VLC to play the audio stream.</li>
</ul>
<p>The long-term goal is to implement this feature properly in <code class="language-plaintext highlighter-rouge">scrcpy</code>.</p>
<h2 id="how-to-use-sndcpy">How to use sndcpy</h2>
<p>You can either <a href="https://github.com/rom1v/sndcpy/blob/master/README.md#get-the-app">download the release</a> or <a href="https://github.com/rom1v/sndcpy/blob/master/BUILD.md">build the app</a>.</p>
<p><a href="https://www.videolan.org/">VLC</a> must be installed on the computer.</p>
<p>Plug in an Android 10 device with USB debugging enabled, and execute:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">./sndcpy</code></pre></figure>
<p>If several devices are connected (listed by <code class="language-plaintext highlighter-rouge">adb devices</code>):</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">./sndcpy &lt;serial&gt; <span class="c"># replace &lt;serial&gt; by the device serial</span></code></pre></figure>
<p><em>(omit <code class="language-plaintext highlighter-rouge">./</code> on Windows)</em></p>
<p>It will install the app on the device, and request permission to start audio
capture:</p>
<p class="center"><img src="/assets/sndcpy/request.png" alt="request" /></p>
<p>Once you have clicked on <em>START NOW</em>, press <em>Enter</em> in the console to start playing
on the computer. Press <code class="language-plaintext highlighter-rouge">Ctrl</code>+<code class="language-plaintext highlighter-rouge">c</code> in the terminal to stop (on Windows,
just disconnect the device or stop capture from the device notifications instead).</p>
<p>The sound continues to be played on the device. The volume can be adjusted
independently on the device and on the computer.</p>
<h2 id="apps-restrictions">Apps restrictions</h2>
<p><code class="language-plaintext highlighter-rouge">sndcpy</code> may only forward audio from apps which do not <a href="https://developer.android.com/guide/topics/media/playback-capture#allowing_playback_capture">prevent audio
capture</a>. The rules are detailed in <a href="https://developer.android.com/guide/topics/media/playback-capture#capture_policy">§capture policy</a>:</p>
<blockquote>
<ul>
<li>By default, apps that target versions up to and including to Android 9.0 do
not permit playback capture. To enable it, include
<code class="language-plaintext highlighter-rouge">android:allowAudioPlaybackCapture="true"</code> in the app’s <code class="language-plaintext highlighter-rouge">manifest.xml</code> file.</li>
<li>By default, apps that target Android 10 (API level 29) or higher allow their
audio to be captured. To disable playback capture, include
<code class="language-plaintext highlighter-rouge">android:allowAudioPlaybackCapture="false"</code> in the app’s <code class="language-plaintext highlighter-rouge">manifest.xml</code>
file.</li>
</ul>
</blockquote>
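<p>The two default rules quoted above can be written as a small decision function. This is a minimal sketch covering only these two rules; <code class="language-plaintext highlighter-rouge">CapturePolicy</code> and <code class="language-plaintext highlighter-rouge">allowsPlaybackCapture()</code> are hypothetical names, not part of the Android API.</p>

```java
// Sketch: default playback capture rules, as quoted from the Android documentation.
final class CapturePolicy {
    static final int ANDROID_10_API_LEVEL = 29;

    // `manifestValue` is the app's android:allowAudioPlaybackCapture attribute,
    // or null when the attribute is absent from the manifest.
    static boolean allowsPlaybackCapture(int targetSdkVersion, Boolean manifestValue) {
        if (manifestValue != null) {
            return manifestValue; // an explicit value overrides the default
        }
        // Default: allowed only for apps targeting Android 10 (API level 29) or higher.
        return targetSdkVersion >= ANDROID_10_API_LEVEL;
    }
}
```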
<p>So some apps might need to be updated to support audio capture.</p>
<h2 id="integration-in-scrcpy">Integration in scrcpy</h2>
<p>Ideally, I would like <code class="language-plaintext highlighter-rouge">scrcpy</code> to support audio forwarding directly. However,
this will require quite a lot of work.</p>
<p>In particular, <em>scrcpy</em> does not use an Android app (<a href="https://github.com/Genymobile/scrcpy/issues/14#issuecomment-575920604">required</a> for capturing
audio), it currently only runs a <a href="/2018/03/introducing-scrcpy/#run-a-java-main-on-android">Java main</a> as <em>shell</em> (required
to inject events and capture the screen without asking).</p>
<p>It will also require implementing audio playback (done by VLC in this
PoC), audio recording (for <code class="language-plaintext highlighter-rouge">scrcpy --record file.mkv</code>), encoding and
decoding to transmit a compressed stream, and audio-video synchronization…</p>
<p>Since I develop <em>scrcpy</em> in my free time, this feature will probably not be
integrated very soon. Therefore, I prefer to release a working proof-of-concept
which does the job.</p>
<h1>Introducing USBaudio</h1>
<p>2019-06-20T09:40:00+02:00, https://blog.rom1v.com/2019/06/introducing-usbaudio</p>
<h2 id="forwarding-audio-from-android-devices">Forwarding audio from Android devices</h2>
<p>In order to support audio forwarding in <a href="/2018/03/introducing-scrcpy/">scrcpy</a>, I first implemented an
experiment on a separate branch (see <a href="https://github.com/Genymobile/scrcpy/issues/14#issuecomment-375103051">issue #14</a>). But it was too hacky
and fragile to be merged (and it did not work on all platforms).</p>
<p>So I decided to write a separate tool: <a href="https://github.com/rom1v/usbaudio">USBaudio</a>.</p>
<p>It works on <em>Linux</em> with <em>PulseAudio</em>.</p>
<h2 id="how-to-use-usbaudio">How to use USBaudio</h2>
<p>First, you need to <a href="https://github.com/rom1v/usbaudio/tree/master/README.md#build">build it</a> (follow the instructions).</p>
<p>Plug in an Android device. If USB debugging is enabled, just execute:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">usbaudio</code></pre></figure>
<p>If USB debugging is disabled (or if multiple devices are connected), you need to
specify a device, either by its <em>serial</em> or by its <em>vendor id</em> and <em>product id</em> (as
printed by <code class="language-plaintext highlighter-rouge">lsusb</code>):</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">usbaudio <span class="nt">-s</span> 0123456789abcdef
usbaudio <span class="nt">-d</span> 18d1:4ee2</code></pre></figure>
<p>The audio should be played on the computer.</p>
<p>If it’s stuttering, try increasing the <em>live caching</em> value (at the cost of a
higher latency):</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="c"># default is 50ms</span>
usbaudio <span class="nt">--live-caching</span><span class="o">=</span>100</code></pre></figure>
<p><em>Note that it can also be directly captured by <a href="https://obsproject.com/">OBS</a>:</em></p>
<p class="center"><img src="/assets/usbaudio/obs.png" alt="obs" /></p>
<h2 id="how-does-it-work">How does it work?</h2>
<p>USBaudio executes 3 steps successively:</p>
<ol>
<li>It enables audio accessory on the device (by sending <a href="https://source.android.com/devices/accessories/aoa2">AOA</a> requests via
<a href="https://libusb.info/">libusb</a>), so that the audio is forwarded over USB. If it works,
<em>PulseAudio</em> (or <em>ALSA</em>) on the computer should detect a new audio input
source.</li>
<li>It retrieves the <em>PulseAudio</em> input source id associated to the Android
device (via <a href="https://freedesktop.org/software/pulseaudio/doxygen/">libpulse</a>).</li>
<li>It <a href="https://linux.die.net/man/3/exec">exec</a>s VLC to play audio from this input source.</li>
</ol>
<p><em>Note that enabling audio accessory changes the <a href="https://source.android.com/devices/accessories/aoa2#detecting-android-open-accessory-20-support">USB device product id</a>,
so it will close any adb connection (and scrcpy). Therefore, you should enable
audio forwarding <strong>before</strong> running scrcpy.</em></p>
<h2 id="manually">Manually</h2>
<p>To only enable audio accessory without playing:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">usbaudio <span class="nt">-n</span>
usbaudio <span class="nt">--no-play</span></code></pre></figure>
<p>The audio input sources can be listed by:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">pactl list short sources</code></pre></figure>
<p>For example:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ pactl list short sources
...
5 alsa_input.usb-LGE_Nexus_5_05f5e60a0ae518e5-01.analog-stereo module-alsa-card.c s16le 2ch 44100Hz RUNNING
</code></pre></div></div>
<p>Use the number (here <code class="language-plaintext highlighter-rouge">5</code>) to play it with VLC:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">vlc <span class="nt">-Idummy</span> <span class="nt">--live-caching</span><span class="o">=</span>50 pulse://5</code></pre></figure>
<p>Alternatively, you can use ALSA directly:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nb">cat</span> /proc/asound/cards</code></pre></figure>
<p>For example:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat /proc/asound/cards
...
1 [N5 ]: USB-Audio - Nexus 5
LGE Nexus 5 at usb-0000:00:14.0-4, high speed
</code></pre></div></div>
<p>Use the device number (here <code class="language-plaintext highlighter-rouge">1</code>) as follows:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">vlc <span class="nt">-Idummy</span> <span class="nt">--live-caching</span><span class="o">=</span>50 alsa://hw:1</code></pre></figure>
<p>If it works manually but not automatically (without <code class="language-plaintext highlighter-rouge">-n</code>), then please open an
<a href="https://github.com/rom1v/usbaudio/issues">issue</a>.</p>
<h2 id="limitations">Limitations</h2>
<p>It does not work on all devices: it seems that audio accessory is not always
well supported. But it’s better than nothing.</p>
<p><a href="https://en.wikipedia.org/wiki/Android_Q">Android Q</a> added a new <a href="https://developer.android.com/preview/features/playback-capture">playback capture API</a>. Hopefully,
<em>scrcpy</em> could use it to forward audio in the future (but only for Android Q
devices).</p>
<h1>A new core playlist for VLC 4</h1>
<p><em>2019-05-21T09:25:00+02:00</em>, <a href="https://blog.rom1v.com/2019/05/a-new-core-playlist-for-vlc-4">https://blog.rom1v.com/2019/05/a-new-core-playlist-for-vlc-4</a></p>
<p>The <a href="https://wiki.videolan.org/Hacker_Guide/Core/">core</a> playlist in VLC was started <a href="https://git.videolan.org/?p=vlc.git;a=commit;h=57e189eb5d1d387f2036c31720e1e9aa8cb3ea78">a long time
ago</a>. Since then, it has grown to handle too many
different things, to the point it became a kind of <a href="https://en.wikipedia.org/wiki/God_object">god object</a>.</p>
<p>In practice, the playlist was also controlling playback (start, stop, change
volume…), configuring audio and video outputs, storing media detected by
<em>discovery</em>…</p>
<p>For <a href="https://www.youtube.com/watch?v=jzvC-0WCjKU&t=312">VLC 4</a>, we wanted a new playlist API, containing a simple list of
items (instead of a tree), acting as a <em>media provider</em> for a <em>player</em>, without
unrelated responsibilities.</p>
<p>I <a href="https://code.videolan.org/videolan/vlc/commits/76e575307494a1f8ddf6f30266f5e6d7466a7013">wrote it</a> several months ago (at <a href="https://videolabs.io/">Videolabs</a>). Now that the
old one has been <a href="https://code.videolan.org/videolan/vlc/commit/c67934b0b4fc9298cb0784c07f701392589e61b7">removed</a>, it’s time to give some
technical details.</p>
<p class="center"><img src="/assets/vlc_playlist/vlc.png" alt="vlc" /></p>
<ul id="markdown-toc">
<li><a href="#objectives" id="markdown-toc-objectives">Objectives</a></li>
<li><a href="#data-structure" id="markdown-toc-data-structure">Data structure</a></li>
<li><a href="#interaction-with-ui" id="markdown-toc-interaction-with-ui">Interaction with UI</a> <ul>
<li><a href="#desynchronization" id="markdown-toc-desynchronization">Desynchronization</a></li>
<li><a href="#synchronization" id="markdown-toc-synchronization">Synchronization</a> <ul>
<li><a href="#core-to-ui" id="markdown-toc-core-to-ui">Core to UI</a></li>
<li><a href="#ui-to-core" id="markdown-toc-ui-to-core">UI to core</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#random-playback" id="markdown-toc-random-playback">Random playback</a> <ul>
<li><a href="#randomizer" id="markdown-toc-randomizer">Randomizer</a></li>
</ul>
</li>
<li><a href="#sorting" id="markdown-toc-sorting">Sorting</a></li>
<li><a href="#interaction-with-the-player" id="markdown-toc-interaction-with-the-player">Interaction with the player</a></li>
<li><a href="#media-source" id="markdown-toc-media-source">Media source</a></li>
<li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ul>
<h2 id="objectives">Objectives</h2>
<p>One major design goal is to expose what <strong>UI frameworks</strong> need. Several user
interfaces, like Qt, Mac OS and Android<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote">1</a></sup>, will use this API to display and
interact with the main VLC playlist.</p>
<p>The playlist must be <strong>performant</strong> for common use cases and usable from
<strong>multiple threads</strong>.</p>
<p>Indeed, in VLC, user interfaces are implemented as <em>modules</em> loaded dynamically.
In general, there is exactly one user interface, but there may be none or (in
theory) several. Thus, the playlist may not be bound to the <a href="https://en.wikipedia.org/wiki/Event_loop">event loop</a> of
some specific user interface. Moreover, the playlist may be modified from a
<em>player</em> thread; for example, playing a zip archive will replace the item by its
content automatically.</p>
<p>As a consequence, the playlist will use a <a href="https://en.wikipedia.org/wiki/Lock_(computer_science)">mutex</a>; to avoid <a href="https://en.wikipedia.org/wiki/Time_of_check_to_time_of_use">ToCToU</a> issues, it
will also expose public functions to lock and unlock it. But as we will see
later, there will be other consequences.</p>
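<p>To illustrate why the lock itself must be public, here is a minimal sketch. The <code class="language-plaintext highlighter-rouge">toy_playlist</code> type and its functions are hypothetical (built on pthreads), not the actual VLC API; the point is only that the check and the use happen under a single lock:</p>
<figure class="highlight"><pre><code class="language-c" data-lang="c">#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a playlist guarded by a mutex (not the VLC type). */
struct toy_playlist {
    pthread_mutex_t lock;
    const char *items[8];
    size_t count;
};

/* Both accessors assume that the caller holds the lock. */
static size_t toy_count(const struct toy_playlist *p)
{
    return p->count;
}

static const char *toy_get(const struct toy_playlist *p, size_t index)
{
    assert(index < p->count);
    return p->items[index];
}

/* Return the last item, or NULL if the playlist is empty.
 * The check (count) and the use (get) are executed under the same lock,
 * so no other thread can remove items in between. */
static const char *toy_last(struct toy_playlist *p)
{
    pthread_mutex_lock(&p->lock);
    size_t count = toy_count(p);
    const char *last = count ? toy_get(p, count - 1) : NULL;
    pthread_mutex_unlock(&p->lock);
    return last;
}</code></pre></figure>
<p>If <code class="language-plaintext highlighter-rouge">toy_count()</code> and <code class="language-plaintext highlighter-rouge">toy_get()</code> locked internally instead, the playlist could shrink between the two calls.</p>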
<h2 id="data-structure">Data structure</h2>
<p>User interfaces need <a href="https://en.wikipedia.org/wiki/Random_access">random access</a> to the playlist items, so a <em>vector</em> is the
most natural structure to store the items. A <em>vector</em> is provided by the
standard library of many languages (<a href="https://en.cppreference.com/w/cpp/container/vector"><code class="language-plaintext highlighter-rouge">vector</code></a> in <em>C++</em>, <a href="https://doc.rust-lang.org/std/vec/struct.Vec.html"><code class="language-plaintext highlighter-rouge">Vec</code></a> in <em>Rust</em>,
<a href="https://docs.oracle.com/javase/10/docs/api/java/util/ArrayList.html"><code class="language-plaintext highlighter-rouge">ArrayList</code></a> in <em>Java</em>…). But here, we’re in <em>C</em>, so there is nothing.</p>
<p>In the playlist, we only need a vector of <em>pointers</em>, so I first <a href="https://mailman.videolan.org/pipermail/vlc-devel/2018-July/120434.html">proposed
improvements</a> to an existing type, <code class="language-plaintext highlighter-rouge">vlc_array_t</code>, which only
supports <code class="language-plaintext highlighter-rouge">void *</code> as items. But it was considered useless
(<a href="https://mailman.videolan.org/pipermail/vlc-devel/2018-July/120466.html">1</a>, <a href="https://mailman.videolan.org/pipermail/vlc-devel/2018-July/120509.html">2</a>) because it is too limited and
not type-safe.</p>
<p>Therefore, I wrote <a href="https://code.videolan.org/videolan/vlc/commit/983c43f05928032a14f201c506d6b9c51d0c5145?expanded=1"><code class="language-plaintext highlighter-rouge">vlc_vector</code></a>. It is implemented using macros so that it’s
generic over its item type. For example, we can use a vector of <code class="language-plaintext highlighter-rouge">int</code>s as
follows:</p>
<figure class="highlight"><pre><code class="language-c" data-lang="c"><span class="c1">// declare and initialize a vector of int</span>
<span class="k">struct</span> <span class="n">VLC_VECTOR</span><span class="p">(</span><span class="kt">int</span><span class="p">)</span> <span class="n">vec</span> <span class="o">=</span> <span class="n">VLC_VECTOR_INITIALIZER</span><span class="p">;</span>
<span class="c1">// append 0, 10, 20, 30 and 40</span>
<span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o"><</span> <span class="mi">5</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">vlc_vector_push</span><span class="p">(</span><span class="o">&</span><span class="n">vec</span><span class="p">,</span> <span class="mi">10</span> <span class="o">*</span> <span class="n">i</span><span class="p">))</span> <span class="p">{</span>
        <span class="c1">// allocation failure...</span>
    <span class="p">}</span>
<span class="p">}</span>
<span class="c1">// remove item at index 2</span>
<span class="n">vlc_vector_remove</span><span class="p">(</span><span class="o">&</span><span class="n">vec</span><span class="p">,</span> <span class="mi">2</span><span class="p">);</span>
<span class="c1">// the vector now contains [0, 10, 30, 40]</span>
<span class="kt">int</span> <span class="n">first</span> <span class="o">=</span> <span class="n">vec</span><span class="p">.</span><span class="n">data</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span> <span class="c1">// 0</span>
<span class="kt">int</span> <span class="n">last</span> <span class="o">=</span> <span class="n">vec</span><span class="p">.</span><span class="n">data</span><span class="p">[</span><span class="n">vec</span><span class="p">.</span><span class="n">size</span> <span class="o">-</span> <span class="mi">1</span><span class="p">];</span> <span class="c1">// 40</span>
<span class="c1">// free resources</span>
<span class="n">vlc_vector_destroy</span><span class="p">(</span><span class="o">&</span><span class="n">vec</span><span class="p">);</span></code></pre></figure>
<p>Internally, the playlist uses a <a href="https://code.videolan.org/videolan/vlc/blob/519877e327bb86aba1f4861412792c63248564d6/src/playlist/playlist.h#L54">vector of playlist items</a>:</p>
<figure class="highlight"><pre><code class="language-c" data-lang="c"><span class="k">typedef</span> <span class="k">struct</span> <span class="n">VLC_VECTOR</span><span class="p">(</span><span class="k">struct</span> <span class="n">vlc_playlist_item</span> <span class="o">*</span><span class="p">)</span> <span class="n">playlist_item_vector_t</span><span class="p">;</span>
<span class="k">struct</span> <span class="n">vlc_playlist</span> <span class="p">{</span>
<span class="n">playlist_item_vector_t</span> <span class="n">items</span><span class="p">;</span>
<span class="c1">// ...</span>
<span class="p">};</span></code></pre></figure>
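<p>To give an idea of how a macro-based generic vector can be built, here is a minimal sketch. All the names (<code class="language-plaintext highlighter-rouge">TOY_VECTOR</code>, <code class="language-plaintext highlighter-rouge">toy_vector_push</code>…) are hypothetical: this is not the <code class="language-plaintext highlighter-rouge">vlc_vector</code> implementation, and it ignores details such as size overflow.</p>
<figure class="highlight"><pre><code class="language-c" data-lang="c">#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Anonymous struct body, generic over the item type. */
#define TOY_VECTOR(type) { size_t cap; size_t size; type *data; }
#define TOY_VECTOR_INITIALIZER { 0, 0, NULL }

/* Double the capacity. On failure, return the old block and leave *cap
 * unchanged, so that the caller can detect the failure by checking cap. */
static void *toy_vector_grow(void *data, size_t *cap, size_t item_size)
{
    size_t new_cap = *cap ? *cap * 2 : 4;
    void *new_data = realloc(data, new_cap * item_size);
    if (!new_data)
        return data;
    *cap = new_cap;
    return new_data;
}

/* Append an item; evaluates to false on allocation failure. */
#define toy_vector_push(pv, item) \
    (((pv)->size < (pv)->cap \
      || ((pv)->data = toy_vector_grow((pv)->data, &(pv)->cap, \
                                       sizeof(*(pv)->data)), \
          (pv)->size < (pv)->cap)) \
     && ((pv)->data[(pv)->size++] = (item), true))

/* Remove the item at index (which must be valid), shifting the tail. */
#define toy_vector_remove(pv, index) \
    (memmove(&(pv)->data[index], &(pv)->data[(index) + 1], \
             ((pv)->size - (index) - 1) * sizeof(*(pv)->data)), \
     (void) --(pv)->size)

#define toy_vector_destroy(pv) free((pv)->data)</code></pre></figure>
<p>Since <code class="language-plaintext highlighter-rouge">data</code> has the real item type, the compiler checks every access, which is what a <code class="language-plaintext highlighter-rouge">void *</code> array cannot offer.</p>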
<h2 id="interaction-with-ui">Interaction with UI</h2>
<p>UI frameworks typically use <em>list models</em> to bind items to a <em>list view</em>
component. A <em>list model</em> must provide:</p>
<ul>
<li>the <strong>total number of items</strong>,</li>
<li>the <strong>item at a given index</strong>.</li>
</ul>
<p>In addition, the model must notify its view when items are <strong>inserted</strong>,
<strong>removed</strong>, <strong>moved</strong> or <strong>updated</strong>, and when the model is <strong>reset</strong> (the
whole content should be invalidated).</p>
<p>For example, Qt list views use <a href="http://doc.qt.io/qt-5/qabstractitemmodel.html"><code class="language-plaintext highlighter-rouge">QAbstractItemModel</code></a>/<a href="http://doc.qt.io/qt-5/qabstractlistmodel.html"><code class="language-plaintext highlighter-rouge">QAbstractListModel</code></a> and
the Android <a href="https://developer.android.com/reference/androidx/recyclerview/widget/RecyclerView.html">recycler view</a> uses <a href="https://developer.android.com/reference/androidx/recyclerview/widget/RecyclerView.Adapter.html"><code class="language-plaintext highlighter-rouge">RecyclerView.Adapter</code></a>.</p>
<p>The playlist API exposes the functions and callbacks providing these features.</p>
<h3 id="desynchronization">Desynchronization</h3>
<p>However, <strong>the core playlist may not be used as a direct data source for a list
model</strong>. In other words, the functions of a list model must not delegate the
calls to the core playlist.</p>
<p>To understand why, let’s consider a typical sequence of calls executed by a view
on its model, from the UI thread:</p>
<figure class="highlight"><pre><code class="language-cpp" data-lang="cpp"><span class="n">model</span><span class="p">.</span><span class="n">count</span><span class="p">();</span>
<span class="n">model</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="mi">0</span><span class="p">);</span>
<span class="n">model</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span>
<span class="n">model</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="mi">2</span><span class="p">);</span>
<span class="n">model</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="mi">3</span><span class="p">);</span>
<span class="n">model</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="mi">4</span><span class="p">);</span></code></pre></figure>
<p>If we implemented <code class="language-plaintext highlighter-rouge">count()</code> and <code class="language-plaintext highlighter-rouge">get(index)</code> by delegating to the playlist, we
would have to lock each call individually:</p>
<figure class="highlight"><pre><code class="language-cpp" data-lang="cpp"><span class="c1">// in some artificial UI framework in C++</span>
<span class="kt">int</span> <span class="n">MyModel</span><span class="o">::</span><span class="n">count</span><span class="p">()</span> <span class="p">{</span>
    <span class="c1">// don't do this</span>
    <span class="n">vlc_playlist_Lock</span><span class="p">(</span><span class="n">playlist</span><span class="p">);</span>
    <span class="kt">int</span> <span class="n">count</span> <span class="o">=</span> <span class="n">vlc_playlist_Count</span><span class="p">(</span><span class="n">playlist</span><span class="p">);</span>
    <span class="n">vlc_playlist_Unlock</span><span class="p">(</span><span class="n">playlist</span><span class="p">);</span>
    <span class="k">return</span> <span class="n">count</span><span class="p">;</span>
<span class="p">}</span>
<span class="n">vlc_playlist_item_t</span> <span class="o">*</span><span class="n">MyModel</span><span class="o">::</span><span class="n">get</span><span class="p">(</span><span class="kt">int</span> <span class="n">index</span><span class="p">)</span> <span class="p">{</span>
    <span class="c1">// don't do this</span>
    <span class="n">vlc_playlist_Lock</span><span class="p">(</span><span class="n">playlist</span><span class="p">);</span>
    <span class="n">vlc_playlist_item_t</span> <span class="o">*</span><span class="n">item</span> <span class="o">=</span> <span class="n">vlc_playlist_Get</span><span class="p">(</span><span class="n">playlist</span><span class="p">,</span> <span class="n">index</span><span class="p">);</span>
    <span class="n">vlc_playlist_Unlock</span><span class="p">(</span><span class="n">playlist</span><span class="p">);</span>
    <span class="k">return</span> <span class="n">item</span><span class="p">;</span>
<span class="p">}</span></code></pre></figure>
<p>Note that locking and unlocking from the UI thread for every playlist item is
not a good idea for responsiveness, but this is a minor issue here.</p>
<p>The real problem is that locking is not sufficient to guarantee correctness: the
<em>list view</em> expects its model to return consistent values. Our implementation
can break this assumption, because the playlist content could change
asynchronously between calls. Here is an example:</p>
<figure class="highlight"><pre><code class="language-cpp" data-lang="cpp"><span class="c1">// the playlist initially contains 5 items: [A, B, C, D, E]</span>
<span class="n">model</span><span class="p">.</span><span class="n">count</span><span class="p">();</span> <span class="c1">// 5</span>
<span class="n">model</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="mi">0</span><span class="p">);</span> <span class="c1">// A</span>
<span class="n">model</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span> <span class="c1">// B</span>
<span class="c1">// the first playlist item is removed from another thread:</span>
<span class="c1">// vlc_playlist_RemoveOne(playlist, 0);</span>
<span class="c1">// the playlist now contains [B, C, D, E]</span>
<span class="n">model</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="mi">2</span><span class="p">);</span> <span class="c1">// D</span>
<span class="n">model</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="mi">3</span><span class="p">);</span> <span class="c1">// E</span>
<span class="n">model</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="mi">4</span><span class="p">);</span> <span class="c1">// out-of-range, undefined behavior (probably segfault)</span></code></pre></figure>
<p>The view could not process any notification of the item removal before the end
of the current execution in its event loop… that is, at least after
<code class="language-plaintext highlighter-rouge">model.get(4)</code>. To avoid this problem, <strong>the data provided by view models must
always <em>live</em> in the UI thread</strong>.</p>
<p>This implies that the UI has to manage <strong>a copy of the playlist content</strong>. The
UI playlist should be considered as a remote out-of-sync view of the core
playlist.</p>
<p>Note that the copy must not be limited to the list of <em>pointers</em> to playlist
items: the content which is displayed and susceptible to change asynchronously
(media metadata, like <em>title</em> or <em>duration</em>) must also be copied. The UI needs a
<strong>deep copy</strong>; otherwise, the content could change (and be exposed) before the
<em>list view</em> was notified… which, again, would break assumptions about the
model.</p>
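<p>As a rough sketch of what such a deep copy could look like (the <code class="language-plaintext highlighter-rouge">toy_media</code> and <code class="language-plaintext highlighter-rouge">toy_ui_item</code> types are hypothetical, they only stand in for the core media and the UI-side item):</p>
<figure class="highlight"><pre><code class="language-c" data-lang="c">#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical core-side media: its fields may change from any thread. */
struct toy_media {
    char *title;
    long duration_ms;
};

/* Hypothetical UI-side item: it owns a copy of everything it displays. */
struct toy_ui_item {
    const struct toy_media *media; /* identity only, not read by the view */
    char *title;
    long duration_ms;
};

/* Snapshot the displayed fields. Must be called with the core playlist
 * locked. Returns 0 on allocation failure. */
static int toy_ui_item_init(struct toy_ui_item *item,
                            const struct toy_media *media)
{
    size_t len = strlen(media->title) + 1;
    item->title = malloc(len);
    if (!item->title)
        return 0;
    memcpy(item->title, media->title, len);
    item->duration_ms = media->duration_ms;
    item->media = media;
    return 1;
}

static void toy_ui_item_destroy(struct toy_ui_item *item)
{
    free(item->title);
}</code></pre></figure>
<p>Once the snapshot is taken, later changes to the core media are not visible to the view until the corresponding “updated” event is handled in the UI thread.</p>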
<h3 id="synchronization">Synchronization</h3>
<p>The core playlist and the UI playlist are out-of-sync. So we need to
“synchronize” them:</p>
<ul>
<li>the changes on the core playlist must be reflected to the UI views,</li>
<li>the user, via the UI, may request changes to the core playlist (to add, move
or remove items for example).</li>
</ul>
<h4 id="core-to-ui">Core to UI</h4>
<p>The core playlist is the <em>source of truth</em>.</p>
<p>Every change to the UI playlist must occur in the UI thread, yet the core
playlist notification handlers are executed from any thread. Therefore, playlist
callback handlers must retrieve appropriate data from the playlist, then <em>post</em>
an event to the UI event loop<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote">2</a></sup>, which will be handled from the UI thread.
From there, the core playlist will be out-of-sync, so it would be incorrect to
access it.</p>
<p>The <strong>order of events</strong> forwarded to the UI thread <strong>must be preserved</strong>. That
way, the indices notified by the core playlist are necessarily valid within the
context of the <em>list model</em> in the UI thread. The core playlist events can be
understood as a sequence of “patches” that the UI playlist must apply to its own
copy.</p>
<p>This only works if <strong>only the core playlist callbacks modify the <em>list model</em>
content</strong>.</p>
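<p>The “patches” idea can be sketched as follows. The event structure and the function names are hypothetical (they are not the VLC callbacks), and each item is reduced to a single letter to keep the example short:</p>
<figure class="highlight"><pre><code class="language-c" data-lang="c">#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ITEMS 32

/* Hypothetical UI-side copy of the playlist content. */
struct toy_ui_list {
    char items[TOY_MAX_ITEMS];
    size_t count;
};

enum toy_event_type { TOY_ITEMS_ADDED, TOY_ITEMS_REMOVED };

/* Hypothetical event, as it could be posted to the UI event loop. */
struct toy_event {
    enum toy_event_type type;
    size_t index;
    size_t count;
    const char *added; /* the inserted items (TOY_ITEMS_ADDED only) */
};

/* Apply one event to the UI copy. Must be called from the UI thread, in
 * the order in which the core playlist emitted the events. */
static void toy_ui_list_apply(struct toy_ui_list *list,
                              const struct toy_event *ev)
{
    assert(ev->index <= list->count);
    char *at = &list->items[ev->index];
    size_t tail = list->count - ev->index; /* items from index to the end */
    switch (ev->type) {
    case TOY_ITEMS_ADDED:
        assert(list->count + ev->count <= TOY_MAX_ITEMS);
        memmove(at + ev->count, at, tail);
        memcpy(at, ev->added, ev->count);
        list->count += ev->count;
        break;
    case TOY_ITEMS_REMOVED:
        assert(ev->count <= tail);
        memmove(at, at + ev->count, tail - ev->count);
        list->count -= ev->count;
        break;
    }
}</code></pre></figure>
<p>The indices carried by an event are only meaningful relative to the state produced by all the previous events, which is why reordering them would corrupt the copy.</p>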
<h4 id="ui-to-core">UI to core</h4>
<p>Since the <em>list model</em> can only be modified by the core playlist callbacks, it
is incorrect to modify it on user actions. As a consequence, the changes must be
requested to the core playlist, which will, in turn, notify the actual changes.</p>
<p>The synchronization is trickier in that direction.</p>
<p>To understand why, suppose the user selects items 10 to 20, then drags and drops
them to index 42. Once the user releases the mouse button to “drop” the
items, we need to lock the core playlist to apply the changes.</p>
<p>The problem is that, before we successfully acquired the lock, another client
may have modified the playlist: it may have cleared it, or shuffled it, or
removed items 5 to 15… As a consequence, we cannot apply the “move” request as
is, because it was created from a previous playlist state.</p>
<p>To solve the issue, we need to adapt the request to make it fit the current
playlist state. In other words, resolve conflicts: find the items if they had
been moved, ignore the items not found for removal…</p>
<p>For that purpose, in addition to functions modifying the content directly, the
playlist exposes functions to <a href="https://code.videolan.org/videolan/vlc/blob/519877e327bb86aba1f4861412792c63248564d6/include/vlc_playlist.h#L548">request</a> “desynchronized”
changes, which automatically resolve conflicts and generate an appropriate
sequence of events to notify the clients of the <em>actual</em> changes.</p>
<p>Let’s take an example. Initially, our playlist contains 10 items:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[A, B, C, D, E, F, G, H, I, J]
</code></pre></div></div>
<p>The user selects <code class="language-plaintext highlighter-rouge">[C, D, E, F, G]</code> and presses the <code class="language-plaintext highlighter-rouge">Del</code> key to remove the items.
To apply the change, we need to lock the core playlist.</p>
<p>But at that time, another thread was holding the lock to apply some other
changes. It removed <code class="language-plaintext highlighter-rouge">F</code> and <code class="language-plaintext highlighter-rouge">I</code>, and shuffled the playlist:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[E, B, D, J, C, G, H, A]
</code></pre></div></div>
<p>Once the other thread unlocks the playlist, our lock finally succeeds. Then, we
call <code class="language-plaintext highlighter-rouge">request_remove([C, D, E, F, G])</code> (this is pseudo-code, the real function
is <a href="https://code.videolan.org/videolan/vlc/blob/519877e327bb86aba1f4861412792c63248564d6/include/vlc_playlist.h#L597"><code class="language-plaintext highlighter-rouge">vlc_playlist_RequestRemove</code></a>).</p>
<p>Internally, it triggers several calls:</p>
<figure class="highlight"><pre><code class="language-c" data-lang="c"><span class="c1">// [E, B, D, J, C, G, H, A]</span>
<span class="n">remove</span><span class="p">(</span><span class="n">index</span> <span class="o">=</span> <span class="mi">4</span><span class="p">,</span> <span class="n">count</span> <span class="o">=</span> <span class="mi">2</span><span class="p">)</span> <span class="c1">// remove [C, G]</span>
<span class="c1">// [E, B, D, J, H, A]</span>
<span class="n">remove</span><span class="p">(</span><span class="n">index</span> <span class="o">=</span> <span class="mi">2</span><span class="p">,</span> <span class="n">count</span> <span class="o">=</span> <span class="mi">1</span><span class="p">)</span> <span class="c1">// remove [D]</span>
<span class="c1">// [E, B, J, H, A]</span>
<span class="n">remove</span><span class="p">(</span><span class="n">index</span> <span class="o">=</span> <span class="mi">0</span><span class="p">,</span> <span class="n">count</span> <span class="o">=</span> <span class="mi">1</span><span class="p">)</span> <span class="c1">// remove [E]</span>
<span class="c1">// [B, J, H, A]</span></code></pre></figure>
<p>Thus, every client (including the UI from which the user requested to remove the
items) will receive a sequence of 3 <a href="https://code.videolan.org/videolan/vlc/blob/519877e327bb86aba1f4861412792c63248564d6/include/vlc_playlist.h#L204"><code class="language-plaintext highlighter-rouge">on_items_removed</code></a> events, one for
each removed slice.</p>
<p>The slices are removed in descending order for both optimization (it minimizes
the number of shifts) and simplicity (the index of a removal does not depend on
previous removals).</p>
<p>In practice, it is very likely that the request will apply exactly to the
current state of the playlist. To avoid unnecessary linear searches to find the
items, these functions accept an additional <code class="language-plaintext highlighter-rouge">index_hint</code> parameter, giving the
index of the items when the request was created. It should (hopefully) almost
always be the same as the index in the current playlist state.</p>
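<p>Here is a simplified sketch of such a “desynchronized” removal, reproducing the example above. It is not the code of <code class="language-plaintext highlighter-rouge">vlc_playlist_RequestRemove</code>: the types and function names are hypothetical, items are single letters assumed to be unique, and the notifications are replaced by a text log of <code class="language-plaintext highlighter-rouge">index:count</code> pairs.</p>
<figure class="highlight"><pre><code class="language-c" data-lang="c">#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_ITEMS 32

/* Hypothetical core playlist: each item is a single (unique) letter. */
struct toy_core_list {
    char items[TOY_MAX_ITEMS];
    size_t count;
};

/* Find an item, trying the hinted index before a linear search.
 * Returns its index, or -1 if it is not in the playlist anymore. */
static long toy_find(const struct toy_core_list *p, char item, size_t hint)
{
    if (hint < p->count && p->items[hint] == item)
        return (long) hint;
    for (size_t i = 0; i < p->count; ++i)
        if (p->items[i] == item)
            return (long) i;
    return -1;
}

/* Compare two indices for sorting in descending order. */
static int toy_cmp_desc(const void *a, const void *b)
{
    size_t x = *(const size_t *) a, y = *(const size_t *) b;
    return (x < y) - (x > y);
}

/* Remove the requested items wherever they are now. Each removed slice is
 * appended to log as "index:count ". The caller must hold the playlist
 * lock and provide a log buffer that is large enough. */
static void toy_request_remove(struct toy_core_list *p, const char *items,
                               size_t n, size_t index_hint, char *log)
{
    assert(n <= TOY_MAX_ITEMS);
    size_t found[TOY_MAX_ITEMS];
    size_t nfound = 0;
    for (size_t i = 0; i < n; ++i) {
        long idx = toy_find(p, items[i], index_hint + i);
        if (idx >= 0)
            found[nfound++] = (size_t) idx; /* items not found are ignored */
    }
    qsort(found, nfound, sizeof(*found), toy_cmp_desc);

    log[0] = '\0';
    for (size_t i = 0; i < nfound;) {
        /* extend the slice while the indices are contiguous */
        size_t j = i + 1;
        while (j < nfound && found[j] + 1 == found[j - 1])
            ++j;
        size_t index = found[j - 1];
        size_t count = j - i;
        memmove(&p->items[index], &p->items[index + count],
                p->count - index - count);
        p->count -= count;
        sprintf(log + strlen(log), "%zu:%zu ", index, count);
        i = j;
    }
}</code></pre></figure>
<p>With the playlist <code class="language-plaintext highlighter-rouge">[E, B, D, J, C, G, H, A]</code>, requesting the removal of <code class="language-plaintext highlighter-rouge">[C, D, E, F, G]</code> with an index hint of 2 produces the three slices described above and leaves <code class="language-plaintext highlighter-rouge">[B, J, H, A]</code>.</p>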
<h2 id="random-playback">Random playback</h2>
<p>Contrary to <a href="https://code.videolan.org/videolan/vlc/blob/519877e327bb86aba1f4861412792c63248564d6/include/vlc_playlist.h#L623">shuffle</a>, random playback does not move the items within the
playlist; instead, it plays them in a non-sequential order.</p>
<p>To select the next item to play, we could just pick one at random.</p>
<p>But this is not ideal: some items will be selected several times (possibly in a
row) while some others will not be selected at all. And if <em>loop</em> is disabled,
when should we stop? After all <em>n</em> items have been selected at least once or
after <em>n</em> playbacks?</p>
<p>Instead, we would like some desirable properties that work both with <em>loop</em>
enabled and disabled:</p>
<ul>
<li>an item must never be selected twice (within a cycle, if <em>loop</em> is enabled),</li>
<li>we should be able to navigate back to the previously selected items,</li>
<li>we must be able to force the selection of a specific item (typically when the
user double-clicks on an item in the playlist),</li>
<li>insertions and removals must be taken into account at any time.</li>
</ul>
<p>In addition, if <em>loop</em> is enabled:</p>
<ul>
<li>the random order must be recomputed for every cycle (we don’t always want the
<a href="https://xkcd.com/221/">same random</a> order),</li>
<li>it should be possible to navigate back to previous items from the previous
cycle,</li>
<li>an item must never be selected twice in a row (in particular, it may not be
the last item of one cycle and the first item of the next cycle).</li>
</ul>
<h3 id="randomizer">Randomizer</h3>
<p>I wrote a <a href="https://code.videolan.org/videolan/vlc/blob/519877e327bb86aba1f4861412792c63248564d6/src/playlist/randomizer.c"><code class="language-plaintext highlighter-rouge">randomizer</code></a> to select items “randomly” within all these
constraints.</p>
<p>To get an idea of the results, here is a sequence produced for a playlist
containing 5 items (<code class="language-plaintext highlighter-rouge">A</code>, <code class="language-plaintext highlighter-rouge">B</code>, <code class="language-plaintext highlighter-rouge">C</code>, <code class="language-plaintext highlighter-rouge">D</code> and <code class="language-plaintext highlighter-rouge">E</code>), with <em>loop</em> enabled (so that it
continues indefinitely):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>E D A B C E B C A D C B E D A C E A D B A D C E B A B D E C B C A E D E D B C A
E C B D A C A E B D C D E A B E D B A C D C B A E D A B C E B D C A E D C A B E
B A E C D C E D A B C E B A D E C B D A D B A C E C E B A D B C E D A E A C B D
A D E B C D C A E B E A D C B C D B A E C E A B D C D E A B D A E C B C A D B E
A B E C D A C B E D E D A B C D E C A B C A E B D E B D C A C A E D B D B E C A
</code></pre></div></div>
<p>Here is how it works.</p>
<p>The <em>randomizer</em> stores a single <em>vector</em> containing all the items of the
playlist. This <em>vector</em> is not shuffled at once. Instead, steps of the
<a href="https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle">Fisher-Yates</a> algorithm are executed one-by-one on demand. This has several
advantages:</p>
<ul>
<li>on insertions and removals, there is no need to reshuffle or shift the
whole array;</li>
<li>if <em>loop</em> is enabled, the history of the last cycle can be kept in place.</li>
</ul>
<p>It also maintains 3 indexes:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">head</code> indicates the end of the items already determined for the current
cycle (if <em>loop</em> is disabled, there is only one cycle),</li>
<li><code class="language-plaintext highlighter-rouge">next</code> points to the item after the current one<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote">3</a></sup>,</li>
<li><code class="language-plaintext highlighter-rouge">history</code> points to the first item of ordered history from the last cycle.</li>
</ul>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0               next  head          history       size
|---------------|-----|.............|-------------|
<------------------->                <----------->
 determined range                    history range
</code></pre></div></div>
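<p>To make the role of <code class="language-plaintext highlighter-rouge">head</code> and <code class="language-plaintext highlighter-rouge">next</code> more concrete, here is a deliberately simplified sketch of the on-demand Fisher-Yates steps. It is not the actual randomizer: the names are hypothetical, items are single letters, and <em>loop</em>, history, insertions and removals are left out.</p>
<figure class="highlight"><pre><code class="language-c" data-lang="c">#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified randomizer (single cycle only). */
struct toy_randomizer {
    char items[32];
    size_t size;
    size_t head; /* end of the range already determined */
    size_t next; /* index of the item after the current one */
};

static bool toy_has_next(const struct toy_randomizer *r)
{
    return r->next < r->size;
}

static bool toy_has_prev(const struct toy_randomizer *r)
{
    return r->next > 1;
}

/* Execute a single Fisher-Yates step: pick one item among those not
 * determined yet, and swap it with the item at head.
 * rand() with a modulo is slightly biased; it is good enough for a sketch. */
static void toy_determine_one(struct toy_randomizer *r)
{
    assert(r->head < r->size);
    size_t remaining = r->size - r->head;
    size_t selected = r->head + (size_t) rand() % remaining;
    char tmp = r->items[r->head];
    r->items[r->head] = r->items[selected];
    r->items[selected] = tmp;
    ++r->head;
}

static char toy_next(struct toy_randomizer *r)
{
    assert(toy_has_next(r));
    if (r->next == r->head)
        toy_determine_one(r); /* shuffle lazily, only when needed */
    return r->items[r->next++];
}

static char toy_prev(struct toy_randomizer *r)
{
    assert(toy_has_prev(r));
    --r->next;
    return r->items[r->next - 1];
}</code></pre></figure>
<p>Navigating back and forth within the determined range never consumes randomness: a new item is only determined when <code class="language-plaintext highlighter-rouge">next</code> reaches <code class="language-plaintext highlighter-rouge">head</code>.</p>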
<p><em>Let’s reuse the example I wrote in the <a href="https://code.videolan.org/videolan/vlc/blob/519877e327bb86aba1f4861412792c63248564d6/src/playlist/randomizer.c#L87">documentation</a>.</em></p>
<p>Here is the initial state with our 5 items:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                         history
next                     |
head                     |
|                        |
A    B    C    D    E
</code></pre></div></div>
<p>The playlist calls <code class="language-plaintext highlighter-rouge">Next()</code> to retrieve the next random item. The randomizer
picks one item (say, <code class="language-plaintext highlighter-rouge">D</code>), and swaps it with the current head (<code class="language-plaintext highlighter-rouge">A</code>). <code class="language-plaintext highlighter-rouge">Next()</code>
returns <code class="language-plaintext highlighter-rouge">D</code>.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                         history
     next                |
     head                |
     |                   |
D    B    C    A    E
<--->
determined range
</code></pre></div></div>
<p>The playlist calls <code class="language-plaintext highlighter-rouge">Next()</code> one more time. The randomizer selects one item
outside the determined range (say, <code class="language-plaintext highlighter-rouge">E</code>) and swaps it with the item at head (<code class="language-plaintext highlighter-rouge">B</code>). <code class="language-plaintext highlighter-rouge">Next()</code> returns <code class="language-plaintext highlighter-rouge">E</code>.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                         history
          next           |
          head           |
          |              |
D    E    C    A    B
<-------->
determined range
</code></pre></div></div>
<p>The playlist calls <code class="language-plaintext highlighter-rouge">Next()</code> one more time. The randomizer selects <code class="language-plaintext highlighter-rouge">C</code> (already
in place). <code class="language-plaintext highlighter-rouge">Next()</code> returns <code class="language-plaintext highlighter-rouge">C</code>.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                         history
               next      |
               head      |
               |         |
D    E    C    A    B
<------------->
determined range
</code></pre></div></div>
<p>The playlist then calls <code class="language-plaintext highlighter-rouge">Prev()</code>. Since the “current” item is <code class="language-plaintext highlighter-rouge">C</code>, the previous
one is <code class="language-plaintext highlighter-rouge">E</code>, so <code class="language-plaintext highlighter-rouge">Prev()</code> returns <code class="language-plaintext highlighter-rouge">E</code>, and <code class="language-plaintext highlighter-rouge">next</code> moves back.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                         history
          next           |
          |    head      |
          |    |         |
D    E    C    A    B
<------------->
determined range
</code></pre></div></div>
<p>The playlist calls <code class="language-plaintext highlighter-rouge">Next()</code>, which returns <code class="language-plaintext highlighter-rouge">C</code>, as expected.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                         history
               next      |
               head      |
               |         |
D    E    C    A    B
<------------->
determined range
</code></pre></div></div>
<p>The playlist calls <code class="language-plaintext highlighter-rouge">Next()</code>, the randomizer selects <code class="language-plaintext highlighter-rouge">B</code>, and returns it.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                         history
                    next |
                    head |
                    |    |
D    E    C    B    A
<------------------>
determined range
</code></pre></div></div>
<p>The playlist calls <code class="language-plaintext highlighter-rouge">Next()</code>, the randomizer selects the last item (it has no
choice). <code class="language-plaintext highlighter-rouge">next</code> and <code class="language-plaintext highlighter-rouge">head</code> now point one item past the end (their value is
the vector size).</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> history
next
head
|
D E C B A
<----------------------->
determinated range
</code></pre></div></div>
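<p>The single-cycle steps walked through so far can be summarized by a minimal sketch. It is not the VLC implementation (which is written in C): the <code class="language-plaintext highlighter-rouge">Randomizer</code> type, its method names and the small xorshift generator are assumptions made for illustration, and the history and loop handling are left out.</p>

```rust
/// Tiny xorshift64 generator, used only to keep the sketch dependency-free
/// (hypothetical helper, not what VLC uses).
struct XorShift64(u64);

impl XorShift64 {
    /// Returns a pseudo-random value in `0..bound` (`bound` must be non-zero).
    fn below(&mut self, bound: usize) -> usize {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 % bound as u64) as usize
    }
}

/// Single-cycle randomizer: `items[..head]` is the determined range,
/// `items[next - 1]` is the current item.
pub struct Randomizer {
    items: Vec<char>,
    head: usize,
    next: usize,
    rng: XorShift64,
}

impl Randomizer {
    pub fn new(items: Vec<char>, seed: u64) -> Self {
        // the xorshift state must not be zero
        Self { items, head: 0, next: 0, rng: XorShift64(seed | 1) }
    }

    pub fn has_next(&self) -> bool {
        self.next < self.items.len()
    }

    pub fn has_prev(&self) -> bool {
        self.next > 1
    }

    pub fn next_item(&mut self) -> char {
        assert!(self.has_next());
        if self.next == self.head {
            // nothing determined ahead: pick one item among the remaining ones
            // and swap it into the `head` position
            let remaining = self.items.len() - self.head;
            let selected = self.head + self.rng.below(remaining);
            self.items.swap(self.head, selected);
            self.head += 1;
        }
        let item = self.items[self.next];
        self.next += 1;
        item
    }

    pub fn prev_item(&mut self) -> char {
        assert!(self.has_prev());
        // the current item is items[next - 1], so the previous one is items[next - 2]
        self.next -= 1;
        self.items[self.next - 1]
    }
}
```

<p>With this layout, going back with <code class="language-plaintext highlighter-rouge">prev_item()</code> and forward again with <code class="language-plaintext highlighter-rouge">next_item()</code> replays the same items, because the determined range is never reshuffled.</p>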
<p>At this point, if <em>loop</em> is disabled, it is not possible to call <code class="language-plaintext highlighter-rouge">Next()</code>
anymore (<code class="language-plaintext highlighter-rouge">HasNext()</code> returns <code class="language-plaintext highlighter-rouge">false</code>). So let’s enable it by calling
<code class="language-plaintext highlighter-rouge">SetLoop()</code>, then let’s call <code class="language-plaintext highlighter-rouge">Next()</code> again.</p>
<p>This will start a new loop cycle. Firstly, <code class="language-plaintext highlighter-rouge">next</code> and <code class="language-plaintext highlighter-rouge">head</code> are reset, and
the whole vector belongs to the last cycle history.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> history
next
head
|
D E C B A
<------------------------>
history range
</code></pre></div></div>
<p>Secondly, to avoid selecting <code class="language-plaintext highlighter-rouge">A</code> twice in a row (as the last item of the
previous cycle and the first item of the new one), the randomizer will
immediately determine another item in the vector (say <code class="language-plaintext highlighter-rouge">C</code>) to be the first of
the new cycle. The items that belong to the history are kept in order.
<code class="language-plaintext highlighter-rouge">head</code> and <code class="language-plaintext highlighter-rouge">history</code> move forward.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> history
next |
| head
| |
C D E B A
<---><------------------>
determinated history range
range
</code></pre></div></div>
<p>Finally, it will actually select and return the first item (<code class="language-plaintext highlighter-rouge">C</code>).</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> history
next
head
|
C D E B A
<---><------------------>
determinated history range
range
</code></pre></div></div>
<p>Then, the user adds an item to the playlist (<code class="language-plaintext highlighter-rouge">F</code>). This item is added in front
of history.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> history
next |
head |
| |
C F D E B A
<---> <------------------>
determinated history range
range
</code></pre></div></div>
<p>The playlist calls <code class="language-plaintext highlighter-rouge">Next()</code>, the randomizer randomly selects <code class="language-plaintext highlighter-rouge">E</code>. <code class="language-plaintext highlighter-rouge">E</code>
“disappears” from the history of the last cycle. This is a general property:
each item may not appear more than once in the “history” (both from the last
and the new cycle). The history order is preserved.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> history
next |
head |
| |
C E F D B A
<--------> <-------------->
determinated history range
range
</code></pre></div></div>
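<p>The two moves illustrated above (choosing the first item of a new cycle, and picking an item out of the history) both keep the remaining history in order. One way to obtain this, sketched here as an assumption rather than as the actual VLC code, is to rotate the sub-slice between <code class="language-plaintext highlighter-rouge">head</code> and the selected index. The function names are hypothetical.</p>

```rust
/// Moves `items[selected]` to position `head`, shifting the items in between
/// one slot to the right, so that their relative order is preserved.
pub fn determine(items: &mut [char], head: usize, selected: usize) {
    assert!(head <= selected && selected < items.len());
    items[head..=selected].rotate_right(1);
}

/// Starts a new cycle: the candidate range excludes the last item, which was
/// the last one played in the previous cycle. `random` is any random number.
pub fn start_new_cycle(items: &mut [char], random: usize) {
    assert!(items.len() >= 2);
    let selected = random % (items.len() - 1);
    determine(items, 0, selected);
}
```

<p>Applied to <code class="language-plaintext highlighter-rouge">D E C B A</code> with index 2 selected, this gives <code class="language-plaintext highlighter-rouge">C D E B A</code>; applied to <code class="language-plaintext highlighter-rouge">C F D E B A</code> with <code class="language-plaintext highlighter-rouge">head</code> at 1 and index 3 selected, it gives <code class="language-plaintext highlighter-rouge">C E F D B A</code>, as in the diagrams.</p>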
<p>The playlist then calls <code class="language-plaintext highlighter-rouge">Prev()</code> 3 times, which yields <code class="language-plaintext highlighter-rouge">C</code>, then <code class="language-plaintext highlighter-rouge">A</code>, then <code class="language-plaintext highlighter-rouge">B</code>.
<code class="language-plaintext highlighter-rouge">next</code> is decremented (modulo size) on each call.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> history
| next
head | |
| | |
C E F D B A
<--------> <-------------->
determinated history range
range
</code></pre></div></div>
<p>Hopefully, the resulting randomness will match what people expect in practice.</p>
<h2 id="sorting">Sorting</h2>
<p>The playlist can be <a href="https://code.videolan.org/videolan/vlc/blob/519877e327bb86aba1f4861412792c63248564d6/include/vlc_playlist.h#L631">sorted</a> by an ordered list of criteria (a
<a href="https://code.videolan.org/videolan/vlc/blob/519877e327bb86aba1f4861412792c63248564d6/include/vlc_playlist.h#L128">key</a> and an <a href="https://code.videolan.org/videolan/vlc/blob/519877e327bb86aba1f4861412792c63248564d6/include/vlc_playlist.h#L143">order</a>).</p>
<p>The implementation is complicated by the fact that item metadata can change
asynchronously (for example if the player is parsing it), making the comparison
function inconsistent.</p>
<p>To avoid the problem, a first pass builds a <a href="https://code.videolan.org/videolan/vlc/blob/519877e327bb86aba1f4861412792c63248564d6/src/playlist/sort.c#L374">list of metadata</a> for all items, then
this list is <a href="https://code.videolan.org/videolan/vlc/blob/519877e327bb86aba1f4861412792c63248564d6/src/playlist/sort.c#L381">sorted</a>, and finally the resulting order is <a href="https://code.videolan.org/videolan/vlc/blob/519877e327bb86aba1f4861412792c63248564d6/src/playlist/sort.c#L383">applied back</a> to the
playlist.</p>
<p>As a benefit, the items are locked only once to retrieve their metadata.</p>
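<p>The three passes can be sketched as follows. This is an illustration in Rust, not a transcription of the C code in <code class="language-plaintext highlighter-rouge">sort.c</code>: the <code class="language-plaintext highlighter-rouge">Meta</code>, <code class="language-plaintext highlighter-rouge">Item</code>, <code class="language-plaintext highlighter-rouge">Key</code> and <code class="language-plaintext highlighter-rouge">Order</code> types are hypothetical, and only two keys are modeled.</p>

```rust
use std::cmp::Ordering;
use std::sync::Mutex;

/// Hypothetical metadata, possibly updated asynchronously by another thread.
#[derive(Clone, Debug, PartialEq)]
pub struct Meta {
    pub title: String,
    pub duration: u64,
}

pub struct Item {
    pub meta: Mutex<Meta>,
}

#[derive(Clone, Copy)]
pub enum Key {
    Title,
    Duration,
}

#[derive(Clone, Copy)]
pub enum Order {
    Ascending,
    Descending,
}

/// Compares two snapshots using the criteria in order of priority.
fn compare(a: &Meta, b: &Meta, criteria: &[(Key, Order)]) -> Ordering {
    for &(key, order) in criteria {
        let ord = match key {
            Key::Title => a.title.cmp(&b.title),
            Key::Duration => a.duration.cmp(&b.duration),
        };
        let ord = match order {
            Order::Ascending => ord,
            Order::Descending => ord.reverse(),
        };
        if ord != Ordering::Equal {
            return ord;
        }
    }
    Ordering::Equal
}

pub fn sort(items: &mut Vec<Item>, criteria: &[(Key, Order)]) {
    // pass 1: snapshot the metadata, locking each item exactly once
    let mut snapshot: Vec<(usize, Meta)> = items
        .iter()
        .enumerate()
        .map(|(i, item)| (i, item.meta.lock().unwrap().clone()))
        .collect();

    // pass 2: sort the snapshot; the comparison is consistent because the
    // copies cannot change during the sort
    snapshot.sort_by(|(_, a), (_, b)| compare(a, b, criteria));

    // pass 3: apply the resulting order back to the items
    let mut slots: Vec<Option<Item>> = items.drain(..).map(Some).collect();
    items.extend(snapshot.iter().map(|(i, _)| slots[*i].take().unwrap()));
}
```

<p>Sorting copies rather than live items costs some memory, but the comparison function never has to take a lock.</p>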
<h2 id="interaction-with-the-player">Interaction with the player</h2>
<p>For VLC 4, <a href="https://twitter.com/tguill3m">Thomas</a> wrote a <a href="https://code.videolan.org/videolan/vlc/blob/519877e327bb86aba1f4861412792c63248564d6/include/vlc_player.h">new player API</a>.</p>
<p>A <em>player</em> can be used without a <em>playlist</em>: we can set its <a href="https://code.videolan.org/videolan/vlc/blob/519877e327bb86aba1f4861412792c63248564d6/include/vlc_player.h#L1082">current media</a> and
the player can request, when necessary, the next media to play from a <a href="https://code.videolan.org/videolan/vlc/blob/519877e327bb86aba1f4861412792c63248564d6/include/vlc_player.h#L354">media
provider</a>.</p>
<p>A <em>playlist</em>, on the other hand, needs a <em>player</em>, and registers itself as its
media provider. They are tightly coupled:</p>
<ul>
<li>the playlist controls the player on user actions;</li>
<li>the player events may update the playlist state.</li>
</ul>
<p>To keep them synchronized:</p>
<ul>
<li>on user actions, we need to lock the playlist, then the player;</li>
<li>on player events, we need to lock the player, then the playlist.</li>
</ul>
<p>This poses a lock-order inversion problem: for example, if thread A locks the
playlist then waits for the player lock, while thread B locks the player then
waits for the playlist lock, then threads A and B are <a href="https://en.wikipedia.org/wiki/Deadlock">deadlocked</a>.</p>
<p>To avoid the problem, the <em>player</em> and the <em>playlist</em> share the same lock.
Concretely, <code class="language-plaintext highlighter-rouge">vlc_playlist_Lock()</code> delegates to <code class="language-plaintext highlighter-rouge">vlc_player_Lock()</code>. In practice,
the lock should be held only for short periods of time.</p>
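<p>The idea of a single shared lock can be illustrated with a small sketch. It is written in Rust with hypothetical types (<code class="language-plaintext highlighter-rouge">Player</code>, <code class="language-plaintext highlighter-rouge">Playlist</code>, <code class="language-plaintext highlighter-rouge">SharedState</code>), whereas the real API is in C and its mutex does not wrap the data it protects; only the delegation is meant to match.</p>

```rust
use std::sync::{Arc, Mutex, MutexGuard};

/// Hypothetical state protected by the single lock.
#[derive(Default)]
pub struct SharedState {
    pub current_index: Option<usize>,
    pub playing: bool,
}

pub struct Player {
    lock: Mutex<SharedState>,
}

impl Player {
    pub fn new() -> Self {
        Self { lock: Mutex::new(SharedState::default()) }
    }

    pub fn lock(&self) -> MutexGuard<'_, SharedState> {
        self.lock.lock().unwrap()
    }
}

pub struct Playlist {
    player: Arc<Player>,
}

impl Playlist {
    pub fn new(player: Arc<Player>) -> Self {
        Self { player }
    }

    /// Locking the playlist is locking the player: there is only one mutex,
    /// so no lock order can be inverted.
    pub fn lock(&self) -> MutexGuard<'_, SharedState> {
        self.player.lock()
    }
}
```

<p>Since both entry points acquire the same mutex, a user action and a player event are simply serialized, whichever object they go through.</p>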
<h2 id="media-source">Media source</h2>
<p>A separate API (<a href="https://code.videolan.org/videolan/vlc/commit/3e0cc1942a963693cf97c99a5ab1e9c6171fe6b1"><em>media source</em> and <em>media tree</em></a>) was necessary
to expose what is called <em>services discovery</em> (used to detect media from various
sources like <a href="https://fr.wikipedia.org/wiki/Server_Message_Block">Samba</a> or <a href="https://en.wikipedia.org/wiki/Media_Transfer_Protocol">MTP</a>), which were previously managed by the old
playlist.</p>
<p>Thus, we could <a href="https://code.videolan.org/videolan/vlc/commit/c67934b0b4fc9298cb0784c07f701392589e61b7">kill</a> the old playlist.</p>
<h2 id="conclusion">Conclusion</h2>
<p>The new playlist and player API should help implement the UI properly <em>(spoiler:
a new <a href="https://www.youtube.com/watch?v=jzvC-0WCjKU&t=841">modern UI</a> is being developed)</em>, to avoid racy bugs and to implement new
features <em>(spoiler: <a href="https://en.wikipedia.org/wiki/Gapless_playback">gapless</a>)</em>.</p>
<p><em>Discuss on <a href="https://www.reddit.com/r/programming/comments/br7or7/a_new_core_playlist_for_vlc_4/">reddit</a> and <a href="https://news.ycombinator.com/item?id=19978295">Hacker News</a>.</em></p>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>Actually, the <em>Android</em> app may continue to implement its own
playlist in Java/Kotlin, to avoid additional layers (Java/JNI and LibVLC). <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>Even in the case where a core playlist callback is executed from the UI
thread, the event must be posted to the event queue, to avoid breaking
the order. Concretely, in Qt, this means connecting signals to slots using
<a href="http://doc.qt.io/qt-5/qt.html#ConnectionType-enum"><code class="language-plaintext highlighter-rouge">Qt::QueuedConnection</code></a> instead of the default <code class="language-plaintext highlighter-rouge">Qt::AutoConnection</code>. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:3" role="doc-endnote">
<p>We use <code class="language-plaintext highlighter-rouge">next</code> instead of <code class="language-plaintext highlighter-rouge">current</code> so that all indexes are unsigned, while
<code class="language-plaintext highlighter-rouge">current</code> could be <code class="language-plaintext highlighter-rouge">-1</code>. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Implementing tile encoding in rav1e2019-04-25T10:15:00+02:00https://blog.rom1v.com/2019/04/implementing-tile-encoding-in-rav1e<p>During the last few months at <a href="https://videolabs.io">Videolabs</a>, I added support for <a href="https://github.com/xiph/rav1e/pull/1126">tile
encoding</a> in <a href="https://github.com/xiph/rav1e">rav1e</a> (a Rust AV1 Encoder).</p>
<h2 id="what-is-this">What is this?</h2>
<p><a href="https://en.wikipedia.org/wiki/AV1">AV1</a> is an open and royalty-free video coding format, competing with <a href="https://en.wikipedia.org/wiki/HEVC">HEVC</a>
(H.265).</p>
<p><a href="https://www.youtube.com/watch?v=ytsRYKQc6kQ">Rav1e</a> is an encoder written in <a href="https://www.rust-lang.org/">Rust</a>, developed by
<a href="https://research.mozilla.org/av1-media-codecs/">Mozilla</a>/<a href="https://xiph.org/">Xiph</a>. As such, it takes an input video and encodes it
to produce a valid AV1 bitstream.</p>
<h3 id="tile-encoding">Tile encoding</h3>
<p>Tile encoding consists of splitting video frames into <em>tiles</em> that can be
encoded and decoded independently in parallel (to use several CPUs), at the cost
of a small loss in compression efficiency.</p>
<p>This speeds up encoding and increases decoding frame rate.</p>
<p class="center"><a href="/assets/rav1e_tile_encoding/tiles.jpg"><img src="/assets/rav1e_tile_encoding/tiles.jpg" alt="tiles" /></a><br />
<em>8 tiles (4 columns × 2 rows)</em></p>
<h2 id="preliminary-work">Preliminary work</h2>
<p>To prepare for tiling, some refactoring was necessary.</p>
<p>A <em>frame</em> contains 3 <em>planes</em> (one for each <a href="https://en.wikipedia.org/wiki/YUV">YUV</a> component, possibly
<a href="https://en.wikipedia.org/wiki/Chroma_subsampling">subsampled</a>). Each plane is stored in a contiguous array, row after row.</p>
<p>To illustrate it, here is a mini-plane containing 6×3 pixels. Padding is
added for alignment (and other details), so its physical size is 8×4 pixels:</p>
<p class="center no-radius"><img src="/assets/rav1e_tile_encoding/plane.png" alt="plane" /></p>
<p>In memory, it is stored in a single array:</p>
<p class="center no-radius"><img src="/assets/rav1e_tile_encoding/plane_memory.png" alt="plane memory" /></p>
<p>The number of array items separating one pixel from the one below is called the
<a href="https://en.wikipedia.org/wiki/Stride_of_an_array">stride</a>. Here, the stride is 8.</p>
<p>The encoder often needs to process rectangular regions. For that purpose, many
functions received a slice of the plane array and the stride value:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">pub</span> <span class="k">fn</span> <span class="nf">write_forty_two</span><span class="p">(</span><span class="n">slice</span><span class="p">:</span> <span class="o">&</span><span class="k">mut</span> <span class="p">[</span><span class="nb">u16</span><span class="p">],</span> <span class="n">stride</span><span class="p">:</span> <span class="nb">usize</span><span class="p">)</span> <span class="p">{</span>
<span class="k">for</span> <span class="n">y</span> <span class="n">in</span> <span class="mi">0</span><span class="o">..</span><span class="mi">2</span> <span class="p">{</span>
<span class="k">for</span> <span class="n">x</span> <span class="n">in</span> <span class="mi">0</span><span class="o">..</span><span class="mi">4</span> <span class="p">{</span>
<span class="n">slice</span><span class="p">[</span><span class="n">y</span> <span class="o">*</span> <span class="n">stride</span> <span class="o">+</span> <span class="n">x</span><span class="p">]</span> <span class="o">=</span> <span class="mi">42</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>This works fine, but the plane slice spans multiple rows.</p>
<p>Let’s split our planes into 4 tiles (2 columns × 2 rows):</p>
<p class="center no-radius"><img src="/assets/rav1e_tile_encoding/plane_regions.png" alt="plane regions" /></p>
<p>In memory, the resulting plane regions are not contiguous:</p>
<p class="center no-radius"><img src="/assets/rav1e_tile_encoding/plane_regions_memory.png" alt="plane regions memory" /></p>
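<p>To make the non-contiguity concrete, here is a small sketch computing which array indices each row of a region occupies. The function name and the 8×4 plane split into 4×2 regions are assumptions chosen for illustration.</p>

```rust
use std::ops::Range;

/// Returns, for each row of a region, the range of indices it occupies in the
/// array backing the plane.
pub fn region_row_ranges(
    stride: usize,
    x: usize,
    y: usize,
    width: usize,
    height: usize,
) -> Vec<Range<usize>> {
    (y..y + height)
        .map(|row| {
            let start = row * stride + x;
            start..start + width
        })
        .collect()
}
```

<p>With a stride of 8, the top-left 4×2 region occupies indices <code class="language-plaintext highlighter-rouge">0..4</code> and <code class="language-plaintext highlighter-rouge">8..12</code>, while the top-right one occupies <code class="language-plaintext highlighter-rouge">4..8</code> and <code class="language-plaintext highlighter-rouge">12..16</code>: the two regions are interleaved, so neither can be described by a single slice without overlapping the other.</p>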
<p>In Rust, it is not sufficient to avoid reading and writing the same memory from several
threads: it must be impossible to write (safe) code that could do it. More
precisely, a mutable reference may not alias any other reference to the same
memory.</p>
<p>As a consequence, passing <strong>a mutable slice</strong> (<code class="language-plaintext highlighter-rouge">&mut [u16]</code>) <strong>spanning multiple
rows is incompatible with tiling</strong>. Instead, we need some structure, implemented
with <a href="https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html">unsafe</a> code, providing a view of the authorized region of the underlying
plane.</p>
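<p>For comparison, here is what safe Rust offers out of the box, as a sketch with a hypothetical function name. Splitting a plane into a top and a bottom half works, because each half is contiguous in memory; there is no equivalent for a left and a right half, which is why a dedicated view type is needed.</p>

```rust
use std::thread;

/// Fills the top half of the plane with 1 and the bottom half with 2, each
/// from its own thread. `split_at_mut` guarantees the halves are disjoint.
pub fn fill_halves_in_parallel(plane: &mut [u16], stride: usize) {
    assert!(stride > 0 && plane.len() % stride == 0);
    let rows = plane.len() / stride;
    let (top, bottom) = plane.split_at_mut((rows / 2) * stride);
    // scoped threads may borrow from the caller's stack (Rust 1.63+)
    thread::scope(|s| {
        s.spawn(|| top.fill(1));
        s.spawn(|| bottom.fill(2));
    });
}
```

<p>The compiler accepts this only because the two mutable slices cannot alias; a pair of slices each spanning all rows would be rejected.</p>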
<p>As a first step, I replaced every piece of code which used a raw slice and the
stride value by the existing <a href="https://github.com/xiph/rav1e/pull/1035"><code class="language-plaintext highlighter-rouge">PlaneSlice</code></a> and <a href="https://github.com/xiph/rav1e/pull/1043"><code class="language-plaintext highlighter-rouge">PlaneMutSlice</code></a>
structures (which first required <a href="https://github.com/xiph/rav1e/pull/1002">making planes generic</a> after
<a href="https://github.com/xiph/rav1e/pull/996">improving the <code class="language-plaintext highlighter-rouge">Pixel</code> trait</a>).</p>
<p>After these changes, our function could be rewritten as follows:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">pub</span> <span class="k">fn</span> <span class="n">write_forty_two</span><span class="o"><</span><span class="n">T</span><span class="p">:</span> <span class="n">Pixel</span><span class="o">></span><span class="p">(</span><span class="n">slice</span><span class="p">:</span> <span class="o">&</span><span class="k">mut</span> <span class="n">PlaneMutSlice</span><span class="o"><</span><span class="nv">'_</span><span class="p">,</span> <span class="n">T</span><span class="o">></span><span class="p">)</span> <span class="p">{</span>
<span class="k">for</span> <span class="n">y</span> <span class="n">in</span> <span class="mi">0</span><span class="o">..</span><span class="mi">2</span> <span class="p">{</span>
<span class="k">for</span> <span class="n">x</span> <span class="n">in</span> <span class="mi">0</span><span class="o">..</span><span class="mi">4</span> <span class="p">{</span>
<span class="n">slice</span><span class="p">[</span><span class="n">y</span><span class="p">][</span><span class="n">x</span><span class="p">]</span> <span class="o">=</span> <span class="mi">42</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<h2 id="tiling-structures">Tiling structures</h2>
<p>So now, all the code using a raw slice and a stride value has been replaced. But
if we look at the definition of <a href="https://github.com/xiph/rav1e/blob/65ac94db7ba3c67c967b96e80c724e17b7414812/src/plane.rs#L582-L586"><code class="language-plaintext highlighter-rouge">PlaneMutSlice</code></a>, we see that it still borrows
the whole plane:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">pub</span> <span class="k">struct</span> <span class="n">PlaneMutSlice</span><span class="o"><</span><span class="nv">'a</span><span class="p">,</span> <span class="n">T</span><span class="p">:</span> <span class="n">Pixel</span><span class="o">></span> <span class="p">{</span>
<span class="k">pub</span> <span class="n">plane</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="k">mut</span> <span class="n">Plane</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">,</span>
<span class="k">pub</span> <span class="n">x</span><span class="p">:</span> <span class="nb">isize</span><span class="p">,</span>
<span class="k">pub</span> <span class="n">y</span><span class="p">:</span> <span class="nb">isize</span>
<span class="p">}</span></code></pre></figure>
<p>So the refactoring, in itself, does not solve the problem.</p>
<p>What is needed now is a structure that exposes a bounded region of the plane.</p>
<h3 id="minimal-example">Minimal example</h3>
<p>For illustration purposes, let’s consider a minimal example, solving a similar
problem: <strong>split a matrix into columns</strong>.</p>
<p class="center no-radius"><img src="/assets/rav1e_tile_encoding/2d_array.png" alt="2D array" /></p>
<p>In memory, the matrix is stored in a single array:</p>
<p class="center no-radius"><img src="/assets/rav1e_tile_encoding/2d_array_memory.png" alt="2D array memory" /></p>
<p>To do so, let’s define a <code class="language-plaintext highlighter-rouge">ColumnMut</code> type, and split the raw array into columns:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">use</span> <span class="nn">std</span><span class="p">::</span><span class="nn">marker</span><span class="p">::</span><span class="n">PhantomData</span><span class="p">;</span>
<span class="k">use</span> <span class="nn">std</span><span class="p">::</span><span class="nn">ops</span><span class="p">::{</span><span class="nb">Index</span><span class="p">,</span> <span class="n">IndexMut</span><span class="p">};</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">ColumnMut</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="n">data</span><span class="p">:</span> <span class="o">*</span><span class="k">mut</span> <span class="nb">u8</span><span class="p">,</span>
<span class="n">cols</span><span class="p">:</span> <span class="nb">usize</span><span class="p">,</span>
<span class="n">rows</span><span class="p">:</span> <span class="nb">usize</span><span class="p">,</span>
<span class="n">phantom</span><span class="p">:</span> <span class="n">PhantomData</span><span class="o"><&</span><span class="nv">'a</span> <span class="k">mut</span> <span class="nb">u8</span><span class="o">></span><span class="p">,</span>
<span class="p">}</span>
<span class="k">impl</span> <span class="nb">Index</span><span class="o"><</span><span class="nb">usize</span><span class="o">></span> <span class="k">for</span> <span class="n">ColumnMut</span><span class="o"><</span><span class="nv">'_</span><span class="o">></span> <span class="p">{</span>
<span class="k">type</span> <span class="n">Output</span> <span class="o">=</span> <span class="nb">u8</span><span class="p">;</span>
<span class="k">fn</span> <span class="nf">index</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="p">,</span> <span class="n">index</span><span class="p">:</span> <span class="nb">usize</span><span class="p">)</span> <span class="k">-></span> <span class="o">&</span><span class="nn">Self</span><span class="p">::</span><span class="n">Output</span> <span class="p">{</span>
<span class="k">assert</span><span class="o">!</span><span class="p">(</span><span class="n">index</span> <span class="o"><</span> <span class="k">self</span><span class="py">.rows</span><span class="p">);</span>
<span class="k">unsafe</span> <span class="p">{</span> <span class="o">&*</span><span class="k">self</span><span class="py">.data</span><span class="nf">.add</span><span class="p">(</span><span class="n">index</span> <span class="o">*</span> <span class="k">self</span><span class="py">.cols</span><span class="p">)</span> <span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">impl</span> <span class="n">IndexMut</span><span class="o"><</span><span class="nb">usize</span><span class="o">></span> <span class="k">for</span> <span class="n">ColumnMut</span><span class="o"><</span><span class="nv">'_</span><span class="o">></span> <span class="p">{</span>
<span class="k">fn</span> <span class="nf">index_mut</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">index</span><span class="p">:</span> <span class="nb">usize</span><span class="p">)</span> <span class="k">-></span> <span class="o">&</span><span class="k">mut</span> <span class="nn">Self</span><span class="p">::</span><span class="n">Output</span> <span class="p">{</span>
<span class="k">assert</span><span class="o">!</span><span class="p">(</span><span class="n">index</span> <span class="o"><</span> <span class="k">self</span><span class="py">.rows</span><span class="p">);</span>
<span class="k">unsafe</span> <span class="p">{</span> <span class="o">&</span><span class="k">mut</span> <span class="o">*</span><span class="k">self</span><span class="py">.data</span><span class="nf">.add</span><span class="p">(</span><span class="n">index</span> <span class="o">*</span> <span class="k">self</span><span class="py">.cols</span><span class="p">)</span> <span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">columns</span><span class="p">(</span>
<span class="n">slice</span><span class="p">:</span> <span class="o">&</span><span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">],</span>
<span class="n">cols</span><span class="p">:</span> <span class="nb">usize</span><span class="p">,</span>
<span class="p">)</span> <span class="k">-></span> <span class="k">impl</span> <span class="n">Iterator</span><span class="o"><</span><span class="n">Item</span> <span class="o">=</span> <span class="n">ColumnMut</span><span class="o"><</span><span class="nv">'_</span><span class="o">>></span> <span class="p">{</span>
<span class="k">assert</span><span class="o">!</span><span class="p">(</span><span class="n">slice</span><span class="nf">.len</span><span class="p">()</span> <span class="o">%</span> <span class="n">cols</span> <span class="o">==</span> <span class="mi">0</span><span class="p">);</span>
<span class="k">let</span> <span class="n">rows</span> <span class="o">=</span> <span class="n">slice</span><span class="nf">.len</span><span class="p">()</span> <span class="o">/</span> <span class="n">cols</span><span class="p">;</span>
<span class="p">(</span><span class="mi">0</span><span class="o">..</span><span class="n">cols</span><span class="p">)</span><span class="nf">.map</span><span class="p">(</span><span class="k">move</span> <span class="p">|</span><span class="n">col</span><span class="p">|</span> <span class="n">ColumnMut</span> <span class="p">{</span>
<span class="n">data</span><span class="p">:</span> <span class="o">&</span><span class="k">mut</span> <span class="n">slice</span><span class="p">[</span><span class="n">col</span><span class="p">],</span>
<span class="n">cols</span><span class="p">,</span>
<span class="n">rows</span><span class="p">,</span>
<span class="n">phantom</span><span class="p">:</span> <span class="n">PhantomData</span><span class="p">,</span>
<span class="p">})</span>
<span class="p">}</span></code></pre></figure>
<p>The <a href="https://doc.rust-lang.org/nomicon/phantom-data.html"><code class="language-plaintext highlighter-rouge">PhantomData</code></a> is necessary to bind the lifetime (in practice,
when we store a raw pointer, we often need a <code class="language-plaintext highlighter-rouge">PhantomData</code>).</p>
<p>We implemented <a href="https://doc.rust-lang.org/std/ops/trait.Index.html"><code class="language-plaintext highlighter-rouge">Index</code></a> and <a href="https://doc.rust-lang.org/std/ops/trait.IndexMut.html"><code class="language-plaintext highlighter-rouge">IndexMut</code></a> traits to provide operator <code class="language-plaintext highlighter-rouge">[]</code>:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c">// via Index trait</span>
<span class="k">let</span> <span class="n">value</span> <span class="o">=</span> <span class="n">column</span><span class="p">[</span><span class="n">y</span><span class="p">];</span>
<span class="c">// via IndexMut trait</span>
<span class="n">column</span><span class="p">[</span><span class="n">y</span><span class="p">]</span> <span class="o">=</span> <span class="n">value</span><span class="p">;</span></code></pre></figure>
<p>The iterator returned by <code class="language-plaintext highlighter-rouge">columns()</code> yields a different column every time, so
the borrowing rules are respected.</p>
<p>Now, we can read from and write to a matrix via temporary column views:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">data</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span>
<span class="mi">4</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span>
<span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">];</span>
<span class="c">// for each column, write the sum</span>
<span class="nf">columns</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="n">data</span><span class="p">,</span> <span class="mi">4</span><span class="p">)</span><span class="nf">.for_each</span><span class="p">(|</span><span class="k">mut</span> <span class="n">col</span><span class="p">|</span> <span class="n">col</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">=</span> <span class="n">col</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">+</span> <span class="n">col</span><span class="p">[</span><span class="mi">1</span><span class="p">]);</span>
<span class="nd">assert_eq!</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span>
<span class="mi">4</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span>
<span class="mi">5</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">9</span><span class="p">]);</span>
<span class="p">}</span></code></pre></figure>
<p>Even if the columns are interlaced in memory, from a <code class="language-plaintext highlighter-rouge">ColumnMut</code> instance, it is
not possible to access data belonging to another column.</p>
<p><em>Note that <code class="language-plaintext highlighter-rouge">cols</code> and <code class="language-plaintext highlighter-rouge">rows</code> fields must be kept private, otherwise they could
be changed from safe code in such a way that breaks boundaries and violates
borrowing rules.</em></p>
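<p>As a point of comparison, the same column split can be written without any <code class="language-plaintext highlighter-rouge">unsafe</code> code, at the cost of one allocation per column. This variant is only a sketch (the <code class="language-plaintext highlighter-rouge">columns_safe</code> name is hypothetical, and this is not what rav1e does): it hands out each cell exactly once, so the borrow checker itself verifies that columns never alias.</p>

```rust
/// Distributes the cells of a row-major matrix into per-column vectors of
/// mutable references. Every cell is yielded once by `iter_mut()`, so no two
/// columns can reference the same memory.
pub fn columns_safe(slice: &mut [u8], cols: usize) -> Vec<Vec<&mut u8>> {
    assert!(cols > 0 && slice.len() % cols == 0);
    let mut columns: Vec<Vec<&mut u8>> = (0..cols).map(|_| Vec::new()).collect();
    for (i, cell) in slice.iter_mut().enumerate() {
        // cell i belongs to column (i % cols)
        columns[i % cols].push(cell);
    }
    columns
}
```

<p>The raw pointer version avoids these allocations and keeps the view as small as a pointer and two sizes, which matters when views are created for every tile of every frame.</p>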
<h3 id="in-rav1e">In rav1e</h3>
<p>A plane is split in a similar way, except that it provides <em>plane regions</em>
instead of <em>columns</em>.</p>
<p>The split is <em>recursive</em>. For example, a <a href="https://github.com/xiph/rav1e/blob/65ac94db7ba3c67c967b96e80c724e17b7414812/src/encoder.rs#L45-L47"><code class="language-plaintext highlighter-rouge">Frame</code></a> contains 3 <a href="https://github.com/xiph/rav1e/blob/65ac94db7ba3c67c967b96e80c724e17b7414812/src/plane.rs#L136-L139"><code class="language-plaintext highlighter-rouge">Plane</code></a>s, so a
<a href="https://github.com/xiph/rav1e/blob/65ac94db7ba3c67c967b96e80c724e17b7414812/src/tiling/tile.rs#L98-L100"><code class="language-plaintext highlighter-rouge">Tile</code></a> contains 3 <a href="https://github.com/xiph/rav1e/blob/65ac94db7ba3c67c967b96e80c724e17b7414812/src/tiling/plane_region.rs#L109-L115"><code class="language-plaintext highlighter-rouge">PlaneRegion</code></a>s, using the same underlying memory.</p>
<p>In practice, more structures related to the encoding state are split into tiles,
provided both in <em>const</em> and <em>mut</em> versions, so there is a whole hierarchy of
tiling structures:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> +- FrameState → TileState
| +- Frame → Tile
| | +- Plane → PlaneRegion
| + RestorationState → TileRestorationState
| | +- RestorationPlane → TileRestorationPlane
| | +- FrameRestorationUnits → TileRestorationUnits
| + FrameMotionVectors → TileMotionVectors
+- FrameBlocks → TileBlocks
</code></pre></div></div>
<p>The split is done by a separate component (see <a href="https://github.com/xiph/rav1e/blob/65ac94db7ba3c67c967b96e80c724e17b7414812/src/tiling/tiler.rs"><code class="language-plaintext highlighter-rouge">tiler.rs</code></a>), which yields a
<em>tile context</em> containing an instance of the hierarchy of tiling views for each
tile.</p>
<h3 id="relative-offsets">Relative offsets</h3>
<p>A priori, there are mainly two possibilities to express offsets during tile
encoding:</p>
<ul>
<li>relative to the tile;</li>
<li>relative to the frame (i.e. absolute).</li>
</ul>
<p>The usage of tiling views strongly favors the first choice. For example, it
would be confusing if a bounded region could not be indexed from 0:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c">// region starting at (64, 64)</span>
<span class="k">let</span> <span class="n">row</span> <span class="o">=</span> <span class="o">&</span><span class="n">region</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span> <span class="c">// panic, out-of-bounds</span>
<span class="k">let</span> <span class="n">row</span> <span class="o">=</span> <span class="o">&</span><span class="n">region</span><span class="p">[</span><span class="mi">64</span><span class="p">];</span> <span class="c">// ok :-/</span></code></pre></figure>
<p>Worse, this would not be possible at all for the second dimension:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c">// region starting at (64, 64)</span>
<span class="k">let</span> <span class="n">first_row</span> <span class="o">=</span> <span class="o">&</span><span class="n">region</span><span class="p">[</span><span class="mi">64</span><span class="p">];</span>
<span class="k">let</span> <span class="n">first_column</span> <span class="o">=</span> <span class="n">first_row</span><span class="p">[</span><span class="mi">64</span><span class="p">];</span> <span class="c">// wrong, a raw slice necessarily starts at 0</span></code></pre></figure>
<p>Therefore, offsets used in tiling views are relative to the tile (contrary to
<em>libaom</em> and AV1 specification).</p>
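<p>As a rough sketch of what tile-relative offsets imply, converting back to a
frame (absolute) offset is just a matter of adding the tile origin. The names
<code class="language-plaintext highlighter-rouge">TileOrigin</code> and <code class="language-plaintext highlighter-rouge">to_frame_offset</code> are hypothetical, chosen for illustration;
they are not part of the rav1e API:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust">// Hypothetical illustration: these names are not rav1e's actual API.
#[derive(Clone, Copy)]
struct TileOrigin {
    x: usize, // horizontal position of the tile's top-left corner in the frame
    y: usize, // vertical position of the tile's top-left corner in the frame
}

// Convert an offset relative to the tile into an absolute frame offset.
fn to_frame_offset(origin: TileOrigin, tile_x: usize, tile_y: usize) -> (usize, usize) {
    (origin.x + tile_x, origin.y + tile_y)
}

fn main() {
    let origin = TileOrigin { x: 64, y: 64 };
    // Inside the tile, the first sample is always at (0, 0)...
    let (frame_x, frame_y) = to_frame_offset(origin, 0, 0);
    // ...which corresponds to the tile origin in the frame.
    println!("tile (0, 0) is frame ({}, {})", frame_x, frame_y);
}</code></pre></figure>
<p>The conversion is only needed at the boundary with code that reasons in frame
coordinates; everything working on a tiling view can keep indexing from 0.</p>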
<h2 id="tile-encoding-1">Tile encoding</h2>
<p>Encoding a frame first involves frame-wise accesses (initialization), then
tile-wise accesses (to encode tiles in parallel), then frame-wise accesses using
the results of tile-encoding (<a href="https://en.wikipedia.org/wiki/Deblocking_filter">deblocking</a>, <a href="https://hacks.mozilla.org/2018/06/av1-next-generation-video-the-constrained-directional-enhancement-filter/">CDEF</a>, <a href="https://www.youtube.com/watch?v=On9VOnIBSEs&t=1335">loop restoration</a>, …).</p>
<p>All the frame-level structures have been replaced by tiling views where
necessary.</p>
<p>The tiling views exist only temporarily, during the calls to
<a href="https://github.com/xiph/rav1e/blob/65ac94db7ba3c67c967b96e80c724e17b7414812/src/encoder.rs#L2113-L2122"><code class="language-plaintext highlighter-rouge">encode_tile()</code></a>. While they are alive, it is not possible to
access frame-level structures (the borrow checker statically prevents it).</p>
<p>Then the tiling structures vanish, and frame-level processing can continue.</p>
<p>This <a href="https://github.com/xiph/rav1e/pull/1126">schema</a> gives an overview:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                                \
      +----------------+         |
      |                |         |
      |                |         |  Frame-wise accesses
      |                |          >
      |                |         |   - FrameState<T>
      |                |         |   - Frame<T>
      +----------------+         |   - Plane<T>
                                /    - ...
              ||  tiling views
              \/
                                \
  +---+  +---+  +---+  +---+     |
  |   |  |   |  |   |  |   |     |  Tile encoding (possibly in parallel)
  +---+  +---+  +---+  +---+     |
                                 |
  +---+  +---+  +---+  +---+     |  Tile-wise accesses
  |   |  |   |  |   |  |   |      >
  +---+  +---+  +---+  +---+     |   - TileStateMut<'_, T>
                                 |   - TileMut<'_, T>
  +---+  +---+  +---+  +---+     |   - PlaneRegionMut<'_, T>
  |   |  |   |  |   |  |   |     |
  +---+  +---+  +---+  +---+     |
                                /
              ||  vanishing of tiling views
              \/
                                \
      +----------------+         |
      |                |         |
      |                |         |  Frame-wise accesses
      |                |          >
      |                |         |  (deblocking, CDEF, ...)
      |                |         |
      +----------------+         |
                                /
</code></pre></div></div>
<h2 id="command-line">Command-line</h2>
<p>To enable tile encoding, parameters have been added to pass the (log2) number of
tiles: <code class="language-plaintext highlighter-rouge">--tile-cols-log2</code> and <code class="language-plaintext highlighter-rouge">--tile-rows-log2</code>. For example, to request 2x2
tiles:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>rav1e video.y4m -o video.ivf --tile-cols-log2 1 --tile-rows-log2 1
</code></pre></div></div>
<p><em>Currently, we need to pass the log2 of the number of tiles (like in libaom,
even if the <code class="language-plaintext highlighter-rouge">aomenc</code> options are called <code class="language-plaintext highlighter-rouge">--tile-columns</code> and <code class="language-plaintext highlighter-rouge">--tile-rows</code>), to
avoid any confusion. Later, we may find a better option that is correct,
unambiguous and user-friendly.</em></p>
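<p>To make the log2 convention concrete, here is a minimal sketch of how the two
values map to a requested tile grid. The function <code class="language-plaintext highlighter-rouge">tile_count</code> is a hypothetical
helper, not something rav1e exposes, and it only computes the grid that was
requested:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust">// Hypothetical helper: requested (columns, rows) for a pair of log2 values.
fn tile_count(tile_cols_log2: u32, tile_rows_log2: u32) -> (usize, usize) {
    (2usize.pow(tile_cols_log2), 2usize.pow(tile_rows_log2))
}

fn main() {
    // Same values as the command line above.
    let (cols, rows) = tile_count(1, 1);
    println!("{}x{} = {} tiles", cols, rows, cols * rows);
}</code></pre></figure>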
<h2 id="bitstream">Bitstream</h2>
<p>Now that we can encode tiles, we must write them according to the <a href="https://aomediacodec.github.io/av1-spec/">AV1
bitstream specification</a>, so that decoders can read the resulting file
correctly.</p>
<p>Before tile encoding (i.e. with a single tile), rav1e produced a correct
bitstream. Several changes were necessary to write multiple tiles.</p>
<h3 id="tile-info">Tile info</h3>
<p>According to <a href="https://aomediacodec.github.io/av1-spec/#tile-info-syntax">Tile info syntax</a>, the <a href="https://github.com/xiph/rav1e/blob/65ac94db7ba3c67c967b96e80c724e17b7414812/src/header.rs#L620-L649">frame header</a> signals the number of columns
and rows of tiles (it always signaled a single tile before).</p>
<p>In addition, when there are several tiles, it signals two more values, described
below.</p>
<h4 id="cdf-update">CDF update</h4>
<p>For <a href="https://en.wikipedia.org/wiki/Entropy_encoding">entropy coding</a>, the encoder maintains and updates a CDF (Cumulative
Distribution Function), representing the probabilities of symbols.</p>
<p>After a frame is encoded, the current CDF state is saved to be possibly used as
a starting state for future frames.</p>
<p>But with tile encoding, each tile finishes with its own CDF state, so which one
should we associate to the reference frame? The answer is: any of them. But we
must signal the one we choose, in <code class="language-plaintext highlighter-rouge">context_update_tile_id</code>; the decoder needs it
to decode the bitstream.</p>
<p>In practice, we keep the CDF from the <a href="https://github.com/xiph/rav1e/commit/ec82d8016db737de51977effb7746eb1137d675b">biggest tile</a>.</p>
<h4 id="size-of-tiles-size">Size of tile sizes</h4>
<p>The size of an encoded tile, in bytes, is variable (of course). Therefore, we
will need to signal the size of each tile.</p>
<p>To gain a few bytes, the number of bytes used to store the size itself is also
variable, and signaled by 2 bits in the frame header
(<code class="language-plaintext highlighter-rouge">tile_size_bytes_minus_1</code>).</p>
<p>Concretely, we must choose the <a href="https://github.com/xiph/rav1e/commit/9a76ff083d97e39b3314f36576994ea99076f996">smallest size</a> that is
sufficient to encode all the tile sizes for the frame.</p>
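<p>One possible way to compute that number of bytes is sketched below. It
assumes that the stored value is the tile size minus 1 (as the name
<code class="language-plaintext highlighter-rouge">tile_size_minus_1</code> suggests) and that sizes fit in 4 bytes; the function
<code class="language-plaintext highlighter-rouge">tile_size_bytes</code> is hypothetical and is not the code from the commit linked
above:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust">// Smallest number of bytes (1 to 4) able to store `max_tile_size - 1`.
// Assumption: max_tile_size is at least 1 and is the largest tile of the frame.
fn tile_size_bytes(max_tile_size: u32) -> u32 {
    let value = max_tile_size - 1;
    let significant_bits = 32 - value.leading_zeros();
    // Round up to whole bytes; a zero value still takes one byte.
    std::cmp::max(1, (significant_bits + 7) / 8)
}

fn main() {
    let bytes = tile_size_bytes(70_000);
    // The frame header stores this count minus 1, on 2 bits.
    println!("tile_size_bytes_minus_1 = {}", bytes - 1);
}</code></pre></figure>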
<h3 id="tile-group">Tile group</h3>
<p>According to <a href="https://aomediacodec.github.io/av1-spec/#general-tile-group-obu-syntax">General tile group OBU syntax</a>, we need to <a href="https://github.com/xiph/rav1e/blob/65ac94db7ba3c67c967b96e80c724e17b7414812/src/encoder.rs#L2177-L2195">signal</a>
two values when there is more than one tile:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">tile_start_and_end_present_flag</code> (we always disable it);</li>
<li><code class="language-plaintext highlighter-rouge">tile_size_minus_1</code>.</li>
</ul>
<p>The tile size (minus 1) is written in <a href="https://en.wikipedia.org/wiki/Endianness">little endian</a>, and uses the
number of bytes we signaled in the frame header.</p>
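<p>As an illustration only, the following sketch produces the bytes of such a
field. The helper <code class="language-plaintext highlighter-rouge">tile_size_minus_1_le</code> and the value of <code class="language-plaintext highlighter-rouge">n</code> are assumptions made
for the example; the real writer in rav1e is the one linked above:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust">// Little-endian bytes of `tile_size - 1` (hypothetical helper).
// Only the first `n` bytes are meant to be written to the bitstream.
fn tile_size_minus_1_le(tile_size: u32) -> [u8; 4] {
    (tile_size - 1).to_le_bytes()
}

fn main() {
    let n = 2; // number of bytes signaled in the frame header (assumed here)
    let bytes = tile_size_minus_1_le(0x1235);
    // 0x1235 - 1 = 0x1234: the least significant byte (0x34) comes first.
    let field = bytes[..n].to_vec();
    println!("{:02x?}", field);
}</code></pre></figure>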
<p>That’s all. This is sufficient to produce a correct bitstream with multiple
tiles.</p>
<h2 id="parallelization">Parallelization</h2>
<p>Thanks to <a href="https://github.com/rayon-rs/rayon">Rayon</a>, <a href="https://github.com/xiph/rav1e/commit/156cc72edf03b5605844b4ecae84dee647fda221">parallelizing</a> tile encoding is as easy as replacing
<code class="language-plaintext highlighter-rouge">iter_mut()</code> by <code class="language-plaintext highlighter-rouge">par_iter_mut()</code>.</p>
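<p>The reason this is so easy is that each tile owns a disjoint mutable view.
The sketch below shows the same idea without Rayon, using scoped threads from
the standard library instead (a deliberately different technique, so that the
example has no external dependency). The "tiles" here are plain chunks of an
array, and <code class="language-plaintext highlighter-rouge">process_tiles_in_parallel</code> is a hypothetical name:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust">use std::thread;

// Add 1 to every sample, processing each 4-sample "tile" on its own thread.
fn process_tiles_in_parallel(mut samples: [u32; 16]) -> [u32; 16] {
    thread::scope(|scope| {
        // chunks_mut() yields disjoint mutable slices, a bit like tiling views.
        for tile in samples.chunks_mut(4) {
            scope.spawn(move || {
                for sample in tile.iter_mut() {
                    *sample += 1;
                }
            });
        }
    }); // all the threads are joined when the scope ends
    samples
}

fn main() {
    let result = process_tiles_in_parallel([0; 16]);
    println!("{:?}", result);
}</code></pre></figure>
<p>With Rayon, the thread pool and the work distribution are handled by the
library, which is why a single method name changes in the real code.</p>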
<p>I ran several encodings on my laptop (8 CPUs) to compare encoding performance
(this is not a good benchmark, but it gives an idea; you are encouraged to run
your own tests). Here are the <a href="https://github.com/xiph/rav1e/pull/1126#issuecomment-484667610">results</a>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> tiles     time       speedup
   1    7mn02,336s     1.00×
   2    3mn53,578s     1.81×
   4    2mn12,995s     3.05×
   8*   1mn57,533s     3.59×
</code></pre></div></div>
<p>Speedups are quite good for 2 and 4 tiles.</p>
<p>*The reason why the speedup is lower than expected for 8 tiles is that my CPU
has actually only 4 physical cores. See <a href="https://www.reddit.com/r/programming/comments/bh6sq8/implementing_tile_encoding_in_rav1e_a_rust_av1/elrl5yo/">this reddit comment</a> and
<a href="https://www.reddit.com/r/rust/comments/bh8xnl/implementing_tile_encoding_in_rav1e_a_rust_av1/elrloye/">this other one</a>.</p>
<h3 id="limits">Limits</h3>
<p>Why not 2×, 4× and 8× speedup? Mainly because of <a href="https://en.wikipedia.org/wiki/Amdahl%27s_law#Parallel_programs">Amdahl’s law</a>.</p>
<p>Tile encoding parallelizes only a part of the whole process: some frame-level
processing is still single-threaded.</p>
<p>Suppose that a proportion <em>p</em> (between 0 and 1) of a given task can be
parallelized. Then its theoretical speedup is <code class="language-plaintext highlighter-rouge">1 / ((p/n) + (1-p))</code>, where <em>n</em>
is the number of threads.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> tiles   speedup   speedup   speedup
         (p=0.9)  (p=0.95)  (p=0.98)
   2      1.82×     1.90×     1.96×
   4      3.08×     3.48×     3.77×
   8      4.71×     5.93×     7.02×
</code></pre></div></div>
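<p>These theoretical values follow directly from the formula, so they are easy
to recompute for other values of <em>p</em> or <em>n</em>. A minimal sketch (the function name
<code class="language-plaintext highlighter-rouge">amdahl_speedup</code> is mine):</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust">// Theoretical speedup for a parallelizable proportion `p` and `n` threads.
fn amdahl_speedup(p: f64, n: f64) -> f64 {
    1.0 / (p / n + (1.0 - p))
}

fn main() {
    // One line per thread count, one column per value of p.
    for n in [2.0, 4.0, 8.0] {
        println!(
            "{} threads: {:.2}x {:.2}x {:.2}x",
            n,
            amdahl_speedup(0.9, n),
            amdahl_speedup(0.95, n),
            amdahl_speedup(0.98, n)
        );
    }
}</code></pre></figure>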
<p>Maybe counterintuitively, <strong>to increase the speedup brought by parallelization,
non-parallelized code must be optimized</strong> (the more threads are used, the larger
the share of the total time spent in non-parallelized code).</p>
<p>The (not-so-reliable) benchmark results for 2 and 4 tiles suggest that tile
encoding represents ~90% of the whole encoding process.</p>
<h2 id="fixing-bugs">Fixing bugs</h2>
<p>Not everything worked the first time.</p>
<p>The most common source of errors while implementing tile encoding was related to
offsets.</p>
<p>When there was only one tile, all offsets were relative to the frame. With
several tiles, some offsets are relative to the current tile, but some others
are still relative to the whole frame. For example, during <a href="https://en.wikipedia.org/wiki/Motion_estimation">motion estimation</a>,
a motion vector can point outside tile boundaries in the reference frame, so we
must take care to convert offsets accordingly.</p>
<p>The most obvious errors were caught by <em>plane regions</em> (which prevent access
outside boundaries), but some others were more subtle.</p>
<p>Such errors could produce interesting images. For example, here is a screenshot
of my first tiled video:</p>
<p class="center"><a href="/assets/rav1e_tile_encoding/bbb_tiling.jpg"><img src="/assets/rav1e_tile_encoding/bbb_tiling.jpg" alt="bbb" /></a></p>
<p>One of these offset confusions was quickly caught by <a href="https://github.com/barrbrain/">barrbrain</a> in
<a href="https://github.com/xiph/rav1e/commit/855b6d06cd2c321d50b7bab8a339c98833502bf3">intra-prediction</a>. I then fixed a similar problem in
<a href="https://github.com/xiph/rav1e/commit/bab3903425a1a9086613de5473bd4282c416c671">inter-prediction</a>.</p>
<p>But the <a href="https://en.wikipedia.org/wiki/Boss_(video_gaming)#Final_boss">final boss</a> bug was way more sneaky: it corrupted the bitstream (so the
decoder was unable to decode), but not always, and never the first frame. When
an inter-frame could be decoded, it was sometimes <a href="https://github.com/xiph/rav1e/pull/1126#issuecomment-482597763">visually corrupted</a>, but only
for some videos and for some encoding parameters.</p>
<p>After more than one week of investigations, I finally <a href="https://github.com/xiph/rav1e/commit/4984e0737984fd0d31894d5d5ebc8e89a248c3ab">found it</a>.
<code class="language-plaintext highlighter-rouge">\o/</code></p>
<h2 id="conclusion">Conclusion</h2>
<p>Implementing this feature was an awesome journey. I learned a lot, both about
Rust and video encoding (I didn’t even know what a tile was before I started).</p>
<p>Big thanks to the Mozilla/Xiph/Daala team, who has been very welcoming and
helpful, and who does amazing work!</p>
<p><em>Discuss on <a href="https://www.reddit.com/r/programming/comments/bh6sq8/implementing_tile_encoding_in_rav1e_a_rust_av1/">r/programming</a>, <a href="https://www.reddit.com/r/rust/comments/bh8xnl/implementing_tile_encoding_in_rav1e_a_rust_av1/">r/rust</a>, <a href="https://www.reddit.com/r/AV1/comments/bh8xsy/implementing_tile_encoding_in_rav1e/">r/AV1</a> and <a href="https://news.ycombinator.com/item?id=19746392">Hacker News</a>.</em></p>
Introducing scrcpy2018-03-08T12:00:00+01:00https://blog.rom1v.com/2018/03/introducing-scrcpy<p>I developed an application to display and control Android devices connected on
USB. It does not require any root access. It works on GNU/Linux, Windows and Mac
OS.</p>
<p class="center"><a href="https://github.com/Genymobile/scrcpy"><img src="/assets/scrcpy/scrcpy.jpg" alt="scrcpy" /></a></p>
<p>It focuses on:</p>
<ul>
<li><strong>lightness</strong> (native, displays only the device screen)</li>
<li><strong>performance</strong> (30~60fps)</li>
<li><strong>quality</strong> (1920×1080 or above)</li>
<li><strong>low latency</strong> (<del>70~100ms</del> <a href="https://github.com/Genymobile/scrcpy/pull/646">35~70ms</a>)</li>
<li><strong>low startup time</strong> (~1 second to display the first image)</li>
<li><strong>non-intrusiveness</strong> (nothing is left installed on the device)</li>
</ul>
<p>Like my previous project, <a href="/2017/03/introducing-gnirehtet/"><em>gnirehtet</em></a>, <a href="https://www.genymobile.com/">Genymobile</a> agreed to open source
it: <a href="https://github.com/Genymobile/scrcpy">scrcpy</a>.</p>
<p>You can <a href="https://github.com/Genymobile/scrcpy/blob/master/README.md">build, install and run</a> it.</p>
<h2 id="how-does-scrcpy-work">How does scrcpy work?</h2>
<p>The application executes a server on the device. The client and the server
communicate via a socket over an <em>adb tunnel</em>.</p>
<p>The server streams an <a href="https://en.wikipedia.org/wiki/H.264/MPEG-4_AVC">H.264</a> video of the device screen. The client decodes the
video frames and displays them.</p>
<p>The client captures input (keyboard and mouse) events, sends them to the server,
which injects them to the device.</p>
<p>The <a href="https://github.com/Genymobile/scrcpy/blob/master/DEVELOP.md">documentation</a> gives more details.</p>
<p>Here, I will detail several technical aspects of the application likely to
interest developers.</p>
<h2 id="minimize-latency">Minimize latency</h2>
<h3 id="no-buffering">No buffering</h3>
<p>It takes time to encode, transmit and decode the video stream. To minimize
latency, we must avoid any additional delay.</p>
<p>For example, let’s stream the screen with <code class="language-plaintext highlighter-rouge">screenrecord</code> and play it with VLC:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>adb exec-out screenrecord --output-format=h264 - | vlc - --demux h264
</code></pre></div></div>
<p>Initially, it works, but quickly the latency increases and frames are broken.
The reason is that VLC associates a <a href="https://en.wikipedia.org/wiki/Presentation_timestamp">PTS</a> to frames, and buffers the stream to
play frames at some target time.</p>
<p>As a consequence, it sometimes prints such errors on <em>stderr</em>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ES_OUT_SET_(GROUP_)PCR is called too late (pts_delay increased to 300 ms)
</code></pre></div></div>
<p>Just before I started the project, Philippe, a colleague who played with
<a href="https://en.wikipedia.org/wiki/WebRTC">WebRTC</a>, advised me to “manually” decode (using <em>FFmpeg</em>) and render frames, to
avoid any additional latency. This saved me from wasting time: it was the right
solution.</p>
<p><a href="https://github.com/Genymobile/scrcpy/blob/master/DEVELOP.md#decoder">Decoding</a> the video stream to retrieve individual frames <a href="https://www.ffmpeg.org/doxygen/3.4/group__lavc__encdec.html">with
FFmpeg</a> is rather <a href="https://github.com/Genymobile/scrcpy/blob/v1.0/app/src/decoder.c#L94-L110">straightforward</a>.</p>
<h3 id="skip-frames">Skip frames</h3>
<p>If, for any reason, the rendering is delayed, decoded frames are dropped so that
<em>scrcpy</em> always displays the last decoded frame.</p>
<p>Note that this behavior may be changed with a <a href="https://github.com/Genymobile/scrcpy/blob/v1.0/app/meson.build#L81">configuration flag</a>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mesonconf x -Dskip_frames=false
</code></pre></div></div>
<h2 id="run-a-java-main-on-android">Run a Java main on Android</h2>
<p>Capturing the device screen requires some privileges, which are granted to
<code class="language-plaintext highlighter-rouge">shell</code>.</p>
<p>It is possible to execute Java code as <code class="language-plaintext highlighter-rouge">shell</code> on Android, by invoking
<code class="language-plaintext highlighter-rouge">app_process</code> from <code class="language-plaintext highlighter-rouge">adb shell</code>.</p>
<h3 id="hello-world">Hello, world!</h3>
<p>Here is a simple Java application:</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java"><span class="kd">public</span> <span class="kd">class</span> <span class="nc">HelloWorld</span> <span class="o">{</span>
    <span class="kd">public</span> <span class="kd">static</span> <span class="kt">void</span> <span class="nf">main</span><span class="o">(</span><span class="nc">String</span><span class="o">...</span> <span class="n">args</span><span class="o">)</span> <span class="o">{</span>
        <span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="s">"Hello, world!"</span><span class="o">);</span>
    <span class="o">}</span>
<span class="o">}</span></code></pre></figure>
<p>Let’s compile and <em>dex</em> it:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>javac -source 1.7 -target 1.7 HelloWorld.java
"$ANDROID_HOME"/build-tools/27.0.2/dx \
--dex --output classes.dex HelloWorld.class
</code></pre></div></div>
<p>Then, we push <code class="language-plaintext highlighter-rouge">classes.dex</code> to an Android device:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>adb push classes.dex /data/local/tmp/
</code></pre></div></div>
<p>And execute it:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ adb shell CLASSPATH=/data/local/tmp/classes.dex app_process / HelloWorld
Hello, world!
</code></pre></div></div>
<h3 id="access-the-android-framework">Access the Android framework</h3>
<p>The application can access the Android framework at runtime.</p>
<p>For example, let’s use <code class="language-plaintext highlighter-rouge">android.os.SystemClock</code>:</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java"><span class="kn">import</span> <span class="nn">android.os.SystemClock</span><span class="o">;</span>
<span class="kd">public</span> <span class="kd">class</span> <span class="nc">HelloWorld</span> <span class="o">{</span>
    <span class="kd">public</span> <span class="kd">static</span> <span class="kt">void</span> <span class="nf">main</span><span class="o">(</span><span class="nc">String</span><span class="o">...</span> <span class="n">args</span><span class="o">)</span> <span class="o">{</span>
        <span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">print</span><span class="o">(</span><span class="s">"Hello,"</span><span class="o">);</span>
        <span class="nc">SystemClock</span><span class="o">.</span><span class="na">sleep</span><span class="o">(</span><span class="mi">1000</span><span class="o">);</span>
        <span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="s">" world!"</span><span class="o">);</span>
    <span class="o">}</span>
<span class="o">}</span></code></pre></figure>
<p>We link our class against <code class="language-plaintext highlighter-rouge">android.jar</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>javac -source 1.7 -target 1.7 \
    -cp "$ANDROID_HOME"/platforms/android-27/android.jar \
    HelloWorld.java
</code></pre></div></div>
<p>Then run it as before.</p>
<p><em>Note that scrcpy also needs to access <a href="https://github.com/Genymobile/scrcpy/blob/master/DEVELOP.md#hidden-methods">hidden methods</a> from the framework. In
that case, linking against <code class="language-plaintext highlighter-rouge">android.jar</code> is not sufficient, so it uses
<a href="https://en.wikipedia.org/wiki/Reflection_(computer_programming)">reflection</a>.</em></p>
<h3 id="like-an-apk">Like an APK</h3>
<p>The execution also works if <code class="language-plaintext highlighter-rouge">classes.dex</code> is embedded in a zip/jar:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>jar cvf hello.jar classes.dex
adb push hello.jar /data/local/tmp/
adb shell CLASSPATH=/data/local/tmp/hello.jar app_process / HelloWorld
</code></pre></div></div>
<p>You know an example of a zip containing <code class="language-plaintext highlighter-rouge">classes.dex</code>? An <a href="https://en.wikipedia.org/wiki/Android_application_package">APK</a>!</p>
<p>Therefore, it works for any installed APK containing a class with a main method:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ adb install myapp.apk
…
$ adb shell pm path my.app.package
package:/data/app/my.app.package-1/base.apk
$ adb shell CLASSPATH=/data/app/my.app.package-1/base.apk \
app_process / HelloWorld
</code></pre></div></div>
<h3 id="in-scrcpy">In scrcpy</h3>
<p>To simplify the build system, I decided to build the server as an APK using
<a href="https://gradle.org/">gradle</a>, even if it’s not a real Android application: <em>gradle</em> provides tasks
for running tests, checking style, etc.</p>
<p>Invoked that way, the server is authorized to capture the device screen.</p>
<h2 id="improve-startup-time">Improve startup time</h2>
<h3 id="quick-installation">Quick installation</h3>
<p>Nothing is required to be installed on the device by the user: at startup, the
client is responsible for executing the server on the device.</p>
<p>We saw that we can execute the main method of the server from an APK either:</p>
<ul>
<li>installed, or</li>
<li>pushed to <code class="language-plaintext highlighter-rouge">/data/local/tmp</code>.</li>
</ul>
<p>Which one to choose?</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ time adb install server.apk
…
real 0m0,963s
…
$ time adb push server.apk /data/local/tmp/
…
real 0m0,022s
…
</code></pre></div></div>
<p>So I decided to push.</p>
<p><em>Note that <code class="language-plaintext highlighter-rouge">/data/local/tmp</code> is readable and writable by <code class="language-plaintext highlighter-rouge">shell</code>, but not
world-writable, so a malicious application cannot replace the server just
before the client executes it.</em></p>
<h3 id="parallelization">Parallelization</h3>
<p>If you executed the <em>Hello, world!</em> in the previous section, you may have
noticed that running <code class="language-plaintext highlighter-rouge">app_process</code> takes some time: <code class="language-plaintext highlighter-rouge">Hello, world!</code> is not
printed before some delay (between 0.5 and 1 second).</p>
<p>In the client, initializing SDL also takes some time.</p>
<p>Therefore, these initialization steps <a href="https://github.com/Genymobile/scrcpy/commit/90a46b4c45637d083e877020d85ade52a9a5fa8e">have been parallelized</a>.</p>
<h2 id="clean-up-the-device">Clean up the device</h2>
<p>After usage, we want to remove the server (<code class="language-plaintext highlighter-rouge">/data/local/tmp/scrcpy-server.jar</code>)
from the device.</p>
<p>We could remove it on exit, but then it would be left behind if the device were disconnected.</p>
<p>Instead, once the server is opened by <code class="language-plaintext highlighter-rouge">app_process</code>, <em>scrcpy</em> <a href="http://man7.org/linux/man-pages/man2/unlink.2.html">unlink</a>s (<code class="language-plaintext highlighter-rouge">rm</code>)
it. Thus, the file is present only for less than 1 second (it is removed even
before the screen is displayed).</p>
<p>The file itself (not its name) is actually removed when the last associated open file
descriptor is closed (at the latest, when <code class="language-plaintext highlighter-rouge">app_process</code> dies).</p>
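<p>This behavior is easy to observe on a Unix-like system. The following sketch
is unrelated to the scrcpy code base: it creates a temporary file (the file
name is arbitrary), removes it while it is still open, then reads its content
through the open handle:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust">use std::env;
use std::fs::{self, File};
use std::io;

// Returns the content read through a handle opened before the file's removal.
// Relies on Unix semantics; other systems may refuse to remove an open file.
fn read_after_unlink() -> String {
    let path = env::temp_dir().join("unlink-demo.txt");
    fs::write(path.clone(), "still readable").expect("write failed");

    let file = File::open(path.clone()).expect("open failed");
    // Remove the name: the path disappears immediately...
    fs::remove_file(path.clone()).expect("remove failed");
    assert!(!path.exists());

    // ...but the content stays reachable through the open descriptor.
    io::read_to_string(file).expect("read failed")
}

fn main() {
    println!("{}", read_after_unlink());
}</code></pre></figure>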
<h2 id="handle-text-input">Handle text input</h2>
<p>Handling input received from a keyboard is more complicated than I thought.</p>
<h3 id="events">Events</h3>
<p>There are 2 kinds of “keyboard” events:</p>
<ul>
<li><a href="https://wiki.libsdl.org/SDL_KeyboardEvent">key</a> events,</li>
<li><a href="https://wiki.libsdl.org/SDL_TextInputEvent">text input</a> events.</li>
</ul>
<p>Key events <a href="https://wiki.libsdl.org/CategoryKeyboard">provide</a> both the <em>scancode</em> (the physical location of a
key on the keyboard) and the <em>keycode</em> (which depends on the keyboard layout).
Only <em>keycodes</em> are used by <em>scrcpy</em> (it doesn’t need the location of physical
keys).</p>
<p>However, key events are not sufficient to handle <a href="https://wiki.libsdl.org/Tutorials/TextInput">text
input</a>:</p>
<blockquote>
<p>Sometimes it can take multiple key presses to produce a character. Sometimes a
single key press can produce multiple characters.</p>
</blockquote>
<p>Even simple characters may not be handled easily with key events, since they
depend on the layout. For example, on a French keyboard, typing <code class="language-plaintext highlighter-rouge">.</code> <em>(dot)</em>
generates <code class="language-plaintext highlighter-rouge">Shift</code>+<code class="language-plaintext highlighter-rouge">;</code>.</p>
<p>Therefore, <em>scrcpy</em> forwards key events to the device only for a <a href="https://github.com/Genymobile/scrcpy/blob/v1.0/app/src/convert.c#L75-L87">limited set of
keys</a>. The remaining ones are handled by <em>text input</em> events.</p>
<h3 id="inject-text">Inject text</h3>
<p>On the Android side, we cannot inject text directly (injecting a <a href="https://developer.android.com/reference/android/view/KeyEvent.html"><code class="language-plaintext highlighter-rouge">KeyEvent</code></a>
created by <a href="https://developer.android.com/reference/android/view/KeyEvent.html#KeyEvent(long,%20java.lang.String,%20int,%20int)">the relevant constructor</a> does not work).
Instead, we can retrieve a list of <code class="language-plaintext highlighter-rouge">KeyEvent</code>s to generate for a <code class="language-plaintext highlighter-rouge">char[]</code>, using
<a href="https://developer.android.com/reference/android/view/KeyCharacterMap.html#getEvents(char[])"><code class="language-plaintext highlighter-rouge">getEvents(char[])</code></a>.</p>
<p>For example:</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java"><span class="kt">char</span><span class="o">[]</span> <span class="n">chars</span> <span class="o">=</span> <span class="o">{</span><span class="sc">'?'</span><span class="o">};</span>
<span class="nc">KeyEvent</span><span class="o">[]</span> <span class="n">events</span> <span class="o">=</span> <span class="n">charMap</span><span class="o">.</span><span class="na">getEvents</span><span class="o">(</span><span class="n">chars</span><span class="o">);</span></code></pre></figure>
<p>Here, <code class="language-plaintext highlighter-rouge">events</code> is initialized with an array of 4 events:</p>
<ol>
<li>press <code class="language-plaintext highlighter-rouge">KEYCODE_SHIFT_LEFT</code></li>
<li>press <code class="language-plaintext highlighter-rouge">KEYCODE_SLASH</code></li>
<li>release <code class="language-plaintext highlighter-rouge">KEYCODE_SLASH</code></li>
<li>release <code class="language-plaintext highlighter-rouge">KEYCODE_SHIFT_LEFT</code></li>
</ol>
<p><a href="https://github.com/Genymobile/scrcpy/blob/v1.0/server/src/main/java/com/genymobile/scrcpy/EventController.java#L103-L107">Injecting</a> those events correctly generates the char <code class="language-plaintext highlighter-rouge">'?'</code>.</p>
<h3 id="handle-accented-characters">Handle accented characters</h3>
<p>Unfortunately, the previous method only works for ASCII characters:</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java"><span class="kt">char</span><span class="o">[]</span> <span class="n">chars</span> <span class="o">=</span> <span class="o">{</span><span class="sc">'é'</span><span class="o">};</span>
<span class="nc">KeyEvent</span><span class="o">[]</span> <span class="n">events</span> <span class="o">=</span> <span class="n">charMap</span><span class="o">.</span><span class="na">getEvents</span><span class="o">(</span><span class="n">chars</span><span class="o">);</span>
<span class="c1">// events is null!!!</span></code></pre></figure>
<p>I first thought there was no way to inject such events from there, until I
discussed it with Philippe (yes, the same as earlier), who knew the solution: it
works when we decompose the characters using <a href="https://source.android.com/devices/input/key-character-map-files#behaviors">combining diacritical dead key
characters</a>.</p>
<p>Concretely, instead of injecting <code class="language-plaintext highlighter-rouge">"é"</code>, we inject <code class="language-plaintext highlighter-rouge">"\u0301e"</code>:</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java"><span class="kt">char</span><span class="o">[]</span> <span class="n">chars</span> <span class="o">=</span> <span class="o">{</span><span class="sc">'\u0301'</span><span class="o">,</span> <span class="sc">'e'</span><span class="o">};</span>
<span class="nc">KeyEvent</span><span class="o">[]</span> <span class="n">events</span> <span class="o">=</span> <span class="n">charMap</span><span class="o">.</span><span class="na">getEvents</span><span class="o">(</span><span class="n">chars</span><span class="o">);</span>
<span class="c1">// now, there are events</span></code></pre></figure>
<p>Therefore, to support accented characters, <em>scrcpy</em> attempts to <a href="https://github.com/Genymobile/scrcpy/blob/v1.0/server/src/main/java/com/genymobile/scrcpy/EventController.java#L97">decompose</a> the
characters using <a href="https://github.com/Genymobile/scrcpy/blob/v1.0/server/src/main/java/com/genymobile/scrcpy/KeyComposition.java"><code class="language-plaintext highlighter-rouge">KeyComposition</code></a>.</p>
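<p>To give an idea of what such a decomposition looks like, here is a tiny,
hypothetical table covering three characters only. It is not the content of
<code class="language-plaintext highlighter-rouge">KeyComposition</code>, and the function name <code class="language-plaintext highlighter-rouge">decompose</code> is mine; the only thing taken
from the example above is that the combining character comes first:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust">// Hypothetical, tiny decomposition table (not scrcpy's KeyComposition).
// The combining mark comes first, as in the "\u0301e" example.
fn decompose(c: char) -> String {
    match c {
        '\u{e9}' => "\u{0301}e".to_string(), // e with acute accent
        '\u{e8}' => "\u{0300}e".to_string(), // e with grave accent
        '\u{ea}' => "\u{0302}e".to_string(), // e with circumflex
        // No known decomposition: keep the character as is.
        other => other.to_string(),
    }
}

fn main() {
    let decomposed = decompose('\u{e9}');
    // Print the code points to make the combining mark visible.
    for c in decomposed.chars() {
        println!("U+{:04X}", c as u32);
    }
}</code></pre></figure>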
<h2 id="set-a-window-icon">Set a window icon</h2>
<p>The application window may have an icon, used in the title bar (for some desktop
environments) and/or in the desktop taskbar.</p>
<p>The window icon must be set from an <a href="https://wiki.libsdl.org/SDL_Surface"><code class="language-plaintext highlighter-rouge">SDL_Surface</code></a> by <a href="https://wiki.libsdl.org/SDL_SetWindowIcon"><code class="language-plaintext highlighter-rouge">SDL_SetWindowIcon</code></a>.
Creating the surface with the icon content is up to the developer. For example,
we could decide to load the icon from a PNG file, or directly from its raw
pixels in memory.</p>
<p>Instead, another colleague, <a href="http://agateau.com/">Aurélien</a>, suggested I use the <a href="https://en.wikipedia.org/wiki/X_PixMap">XPM</a> image format,
which is also valid C source code: <a href="https://github.com/Genymobile/scrcpy/blob/v1.0/app/src/icon.xpm"><code class="language-plaintext highlighter-rouge">icon.xpm</code></a>.</p>
<p>Note that the image is not the content of the variable <code class="language-plaintext highlighter-rouge">icon_xpm</code> declared in
<code class="language-plaintext highlighter-rouge">icon.xpm</code>: it’s the whole file! Thus, <code class="language-plaintext highlighter-rouge">icon.xpm</code> may be both directly opened in
<a href="https://www.gimp.org/">Gimp</a> and included in C source code:</p>
<figure class="highlight"><pre><code class="language-c" data-lang="c"><span class="cp">#include "icon.xpm"</span></code></pre></figure>
<p>As a benefit, we directly “recognize” the icon from the source code, and we can
patch it easily: in debug mode, the <a href="https://github.com/Genymobile/scrcpy/blob/v1.0/app/src/tinyxpm.c#L34-L37">icon color</a> is changed.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Developing this project was an awesome and motivating experience. I’ve learned a
lot (I had never used <em>SDL</em> or <em>libav/FFmpeg</em> before).</p>
<p>The resulting application works better than I initially expected, and I’m happy
to have been able to open source it.</p>
<p><em>Discuss on <a href="https://www.reddit.com/r/Android/comments/834zmr/introducing_scrcpy_an_app_to_display_and_control/">reddit</a> and <a href="https://news.ycombinator.com/item?id=16544977">Hacker News</a>.</em></p>
Gnirehtet rewritten in Rust2017-09-21T17:00:00+02:00https://blog.rom1v.com/2017/09/gnirehtet-rewritten-in-rust<p>Several months ago, I introduced <a href="/2017/03/introducing-gnirehtet/">Gnirehtet</a>, a reverse
tethering tool for Android I wrote in Java.</p>
<p>Since then, <strong>I rewrote it in <a href="https://www.rust-lang.org/">Rust</a></strong>.</p>
<p>And it’s also open source! <a href="https://github.com/Genymobile/gnirehtet">Download it</a>, plug an Android device,
and execute:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./gnirehtet run
</code></pre></div></div>
<p><em>(adb must be installed)</em></p>
<ul id="markdown-toc">
<li><a href="#why-rust" id="markdown-toc-why-rust">Why Rust?</a></li>
<li><a href="#learning-rust" id="markdown-toc-learning-rust">Learning Rust</a></li>
<li><a href="#difficulties" id="markdown-toc-difficulties">Difficulties</a> <ul>
<li><a href="#encapsulation" id="markdown-toc-encapsulation">Encapsulation</a></li>
<li><a href="#observer" id="markdown-toc-observer">Observer</a></li>
<li><a href="#mutable-data-sharing" id="markdown-toc-mutable-data-sharing">Mutable data sharing</a></li>
<li><a href="#compiler-limitations" id="markdown-toc-compiler-limitations">Compiler limitations</a></li>
</ul>
</li>
<li><a href="#safety-pitfalls" id="markdown-toc-safety-pitfalls">Safety pitfalls</a> <ul>
<li><a href="#leakpocalypse" id="markdown-toc-leakpocalypse">Leakpocalypse</a></li>
<li><a href="#undefined-infinity" id="markdown-toc-undefined-infinity">Undefined infinity</a></li>
<li><a href="#segfault" id="markdown-toc-segfault">Segfault</a></li>
</ul>
</li>
<li><a href="#stats" id="markdown-toc-stats">Stats</a> <ul>
<li><a href="#number-of-lines" id="markdown-toc-number-of-lines">Number of lines</a></li>
<li><a href="#binary-size" id="markdown-toc-binary-size">Binary size</a></li>
<li><a href="#memory-usage" id="markdown-toc-memory-usage">Memory usage</a></li>
<li><a href="#cpu-usage" id="markdown-toc-cpu-usage">CPU usage</a></li>
</ul>
</li>
<li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ul>
<h2 id="why-rust">Why Rust?</h2>
<p>At Genymobile, we wanted <em>Gnirehtet</em> not to require the <em>Java Runtime
Environment</em>, so the main requirement was to compile the application to a
<em>native</em> executable binary.</p>
<p>Therefore, I first considered rewriting it in C or C++. But at that time (early
May), I was interested in learning Rust, after vaguely hearing what it
<a href="https://blog.rust-lang.org/2017/02/06/roadmap.html">provided</a>, namely:</p>
<ul>
<li><a href="https://blog.rust-lang.org/2015/04/10/Fearless-Concurrency.html">memory safety without garbage collection</a>,</li>
<li><a href="https://blog.rust-lang.org/2015/04/10/Fearless-Concurrency.html">concurrency without data races</a>,</li>
<li><a href="https://blog.rust-lang.org/2015/05/11/traits.html">abstraction without overhead</a>.</li>
</ul>
<p>However, I had never written a line of Rust code nor heard about Rust
<a href="https://doc.rust-lang.org/book/first-edition/ownership.html">ownership</a>, <a href="https://doc.rust-lang.org/book/first-edition/references-and-borrowing.html">borrowing</a> or <a href="https://doc.rust-lang.org/book/first-edition/lifetimes.html">lifetimes</a>.</p>
<p>But I am convinced that the best way to learn a programming language is to work
full-time on a real project in that language.</p>
<p>I was motivated, so after checking that it could fit our requirements
(basically, I wrote a sample using the <a href="https://en.wikipedia.org/wiki/Asynchronous_I/O">async I/O</a> library <a href="https://crates.io/crates/mio">mio</a>, and
executed it on both Linux and Windows), I decided to rewrite <em>Gnirehtet</em> in
Rust.</p>
<h2 id="learning-rust">Learning Rust</h2>
<p>During the rewriting, I <em>devoured</em> successively the <a href="https://doc.rust-lang.org/book/first-edition/">Rust book</a>, <a href="https://rustbyexample.com/">Rust by
example</a> and the <a href="https://doc.rust-lang.org/nomicon/">Rustonomicon</a>. I learned a lot, and <em>Rust</em> is an awesome
language. I now miss many of its features when I work on a C++ project,
including:</p>
<ul>
<li><a href="https://rustbyexample.com/cast/inference.html">advanced type inference</a>,</li>
<li><a href="https://doc.rust-lang.org/book/first-edition/enums.html">enums</a>,</li>
<li><a href="https://doc.rust-lang.org/book/first-edition/patterns.html">patterns</a>,</li>
<li><a href="https://doc.rust-lang.org/book/first-edition/traits.html">trait bounds</a>,</li>
<li><a href="https://doc.rust-lang.org/std/option/"><code class="language-plaintext highlighter-rouge">Option<T></code></a> (like <a href="https://github.com/tvaneerd/cpp17_in_TTs/blob/master/ALL_IN_ONE.md#stdoptionalt"><code class="language-plaintext highlighter-rouge">std::optional<T></code></a> in C++17, but benefiting from enums
and patterns),</li>
<li><a href="https://doc.rust-lang.org/book/first-edition/macros.html">hygienic macros</a>,</li>
<li>the absence of header files,</li>
<li>the (so simple) build system, <em>and of course</em></li>
<li>guaranteed memory safety.</li>
</ul>
<p>About learning, Paul Graham <a href="http://paulgraham.com/know.html">wrote</a>:</p>
<blockquote>
<p><strong>Reading and experience train your model of the world.</strong> And even if you
forget the experience or what you read, its effect on your model of the world
persists. Your mind is like a compiled program you’ve lost the source of. It
works, but you don’t know why.</p>
</blockquote>
<p>Some of Rust’s concepts (like <a href="https://doc.rust-lang.org/book/first-edition/lifetimes.html">lifetimes</a> or <a href="https://doc.rust-lang.org/book/first-edition/ownership.html#move-semantics">move semantics</a> by default)
provided a significantly different new <em>training set</em> which definitely affected
my model of the world (of programming).</p>
<p>I am not going to present all these features (just click on the links to the
documentation if you are interested). Instead, I will try to explain where and
why Rust resisted the design I wanted to implement, and how to rethink the
problems within Rust constraints.</p>
<p><em>The following part requires some basic knowledge of Rust. You may want to skip
directly to the <a href="#stats">stats</a>.</em></p>
<h2 id="difficulties">Difficulties</h2>
<p>The <a href="https://github.com/Genymobile/gnirehtet/blob/master/DEVELOP.md#relay-server">design</a> of the Java application was pretty effective, so I wanted to
reproduce the global architecture in the Rust version (with adaptations to make
it more <em>Rust</em> idiomatic if necessary).</p>
<p>But I struggled on the details, especially to make the <a href="https://doc.rust-lang.org/book/first-edition/references-and-borrowing.html"><em>borrow
checker</em></a> happy. The <a href="https://doc.rust-lang.org/book/first-edition/references-and-borrowing.html#the-rules">rules</a> are simple:</p>
<blockquote>
<p>First, any borrow must last for a scope no greater than that of the owner.
Second, you may have one or the other of these two kinds of borrows, but not
both at the same time:</p>
<ul>
<li>one or more references (<code class="language-plaintext highlighter-rouge">&T</code>) to a resource,</li>
<li>exactly one mutable reference (<code class="language-plaintext highlighter-rouge">&mut T</code>).</li>
</ul>
</blockquote>
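<p>As a minimal sketch of these two rules (the <code class="language-plaintext highlighter-rouge">borrow_kinds</code> function and its vector are hypothetical, chosen only for illustration), explicit scopes make it visible where each kind of borrow begins and ends:</p>

```rust
/// Demonstrates the two kinds of borrows on a hypothetical vector.
fn borrow_kinds() -> Vec<u8> {
    let mut values = vec![1, 2, 3];
    {
        // Any number of shared references (&T) may coexist.
        // Mutating `values` while `first` or `second` is still used
        // afterwards would be rejected by the compiler (E0502).
        let first = &values;
        let second = &values;
        assert_eq!(first.len(), second.len());
    }
    {
        // Once the shared borrows have ended, exactly one mutable
        // reference (&mut T) may exist.
        let unique = &mut values;
        unique.push(4);
    }
    values
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">borrow_kinds()</code> returns the vector with the pushed element, since the mutable borrow only starts after the shared ones have gone out of scope.</p>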
<p>However, it took me some time to realize how they conflict with some patterns or
principles.</p>
<p>Here is my feedback. I selected four subjects that are general enough to be
independent of this particular project:</p>
<ul>
<li>the conflicts with <a href="#encapsulation">encapsulation</a>;</li>
<li>the <a href="#observer">observer</a> pattern;</li>
<li>how to <a href="#mutable-data-sharing">share mutable data</a>;</li>
<li>a quick note about annoying <a href="#compiler-limitations">compiler limitations</a>.</li>
</ul>
<h3 id="encapsulation">Encapsulation</h3>
<p><strong>The borrowing rules constrain encapsulation.</strong> This was the first consequence
I realized.</p>
<p>Here is a canonical sample:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">pub</span> <span class="k">struct</span> <span class="n">Data</span> <span class="p">{</span>
<span class="n">header</span><span class="p">:</span> <span class="p">[</span><span class="nb">u8</span><span class="p">;</span> <span class="mi">4</span><span class="p">],</span>
<span class="n">payload</span><span class="p">:</span> <span class="p">[</span><span class="nb">u8</span><span class="p">;</span> <span class="mi">20</span><span class="p">],</span>
<span class="p">}</span>
<span class="k">impl</span> <span class="n">Data</span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">new</span><span class="p">()</span> <span class="k">-></span> <span class="n">Self</span> <span class="p">{</span>
<span class="n">Self</span> <span class="p">{</span>
<span class="n">header</span><span class="p">:</span> <span class="p">[</span><span class="mi">0</span><span class="p">;</span> <span class="mi">4</span><span class="p">],</span>
<span class="n">payload</span><span class="p">:</span> <span class="p">[</span><span class="mi">0</span><span class="p">;</span> <span class="mi">20</span><span class="p">],</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">header</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="o">&</span><span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">]</span> <span class="p">{</span>
<span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.header</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">payload</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="o">&</span><span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">]</span> <span class="p">{</span>
<span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.payload</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">data</span> <span class="o">=</span> <span class="nn">Data</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
<span class="k">let</span> <span class="n">header</span> <span class="o">=</span> <span class="n">data</span><span class="nf">.header</span><span class="p">();</span>
<span class="k">let</span> <span class="n">payload</span> <span class="o">=</span> <span class="n">data</span><span class="nf">.payload</span><span class="p">();</span>
<span class="p">}</span></code></pre></figure>
<p>We just create a new instance of <code class="language-plaintext highlighter-rouge">Data</code>, then bind mutable references to the
<code class="language-plaintext highlighter-rouge">header</code> and <code class="language-plaintext highlighter-rouge">payload</code> arrays to local variables, through accessors.</p>
<p>However, this does not compile:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ rustc sample.rs
error[E0499]: cannot borrow `data` as mutable more than once at a time
--> sample.rs:26:19
|
25 | let header = data.header();
| ---- first mutable borrow occurs here
26 | let payload = data.payload();
| ^^^^ second mutable borrow occurs here
27 | }
| - first borrow ends here
</code></pre></div></div>
<p>The compiler may not assume that <code class="language-plaintext highlighter-rouge">header()</code> and <code class="language-plaintext highlighter-rouge">payload()</code> return references to
disjoint data in the <code class="language-plaintext highlighter-rouge">Data</code> struct. Therefore, each one borrows the whole <code class="language-plaintext highlighter-rouge">data</code>
structure. Since the borrowing rules forbid getting two mutable references to
the same resource, it rejects the second call.</p>
<p>Sometimes, we face temporary limitations because the compiler is not smart
enough (yet). This is not the case here: the implementation of <code class="language-plaintext highlighter-rouge">header()</code> might
actually return a reference to <code class="language-plaintext highlighter-rouge">payload</code>, or write to the <code class="language-plaintext highlighter-rouge">payload</code> array,
violating the borrowing rules. And the validity of a method call may not depend
on the method implementation.</p>
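<p>To illustrate why the check has to rely on signatures alone, here is a sketch of a deliberately wrong variant of the sample: <code class="language-plaintext highlighter-rouge">header()</code> keeps exactly the same signature, but its body hands out the payload. The <code class="language-plaintext highlighter-rouge">lengths</code> helper is hypothetical and only exists to show the effect:</p>

```rust
pub struct Data {
    header: [u8; 4],
    payload: [u8; 20],
}

impl Data {
    pub fn new() -> Self {
        Self {
            header: [0; 4],
            payload: [0; 20],
        }
    }

    // Same signature as in the sample above, but a (deliberately wrong)
    // body: it returns the payload instead of the header.
    pub fn header(&mut self) -> &mut [u8] {
        &mut self.payload
    }

    pub fn payload(&mut self) -> &mut [u8] {
        &mut self.payload
    }
}

/// Hypothetical helper: each borrow ends before the next call starts,
/// so this compiles, and it shows that both accessors reach the same array.
fn lengths() -> (usize, usize) {
    let mut data = Data::new();
    let from_header = data.header().len();
    let from_payload = data.payload().len();
    (from_header, from_payload)
}
```

<p>If the compiler accepted two simultaneous mutable borrows whenever the current bodies happened to be disjoint, a change like this one, invisible in the signature, would silently produce two mutable references to the same array.</p>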
<p>To fix the problem, the compiler must be able to know that the local variables
<code class="language-plaintext highlighter-rouge">header</code> and <code class="language-plaintext highlighter-rouge">payload</code> reference <strong>disjoint data</strong>, for example by accessing
the fields directly:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"> <span class="k">let</span> <span class="n">header</span> <span class="o">=</span> <span class="o">&</span><span class="k">mut</span> <span class="n">data</span><span class="py">.header</span><span class="p">;</span>
<span class="k">let</span> <span class="n">payload</span> <span class="o">=</span> <span class="o">&</span><span class="k">mut</span> <span class="n">data</span><span class="py">.payload</span><span class="p">;</span></code></pre></figure>
<p>or by exposing a method providing both references simultaneously:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">impl</span> <span class="n">Data</span> <span class="p">{</span>
<span class="k">fn</span> <span class="nf">header_and_payload</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">],</span> <span class="o">&</span><span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">])</span> <span class="p">{</span>
<span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.header</span><span class="p">,</span> <span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.payload</span><span class="p">)</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">data</span> <span class="o">=</span> <span class="nn">Data</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
<span class="k">let</span> <span class="p">(</span><span class="n">header</span><span class="p">,</span> <span class="n">payload</span><span class="p">)</span> <span class="o">=</span> <span class="n">data</span><span class="nf">.header_and_payload</span><span class="p">();</span>
<span class="p">}</span></code></pre></figure>
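<p>The standard library uses the same approach for slices: <code class="language-plaintext highlighter-rouge">split_at_mut()</code> borrows the slice once and returns two disjoint mutable halves. A minimal sketch, assuming a hypothetical 24-byte packet whose first 4 bytes are the header (the <code class="language-plaintext highlighter-rouge">fill</code> function and the values written are illustrative only):</p>

```rust
/// Writes to the header and payload parts of a hypothetical 24-byte packet.
fn fill(packet: &mut [u8; 24]) {
    // A single call borrows `packet` once and hands back two disjoint slices.
    let (header, payload) = packet.split_at_mut(4);
    header[0] = 0x45;
    // Both mutable slices are usable at the same time.
    payload[0] = header[0] + 1;
}
```

<p>Because both references come from one call, the compiler does not need to know anything about how the two halves relate: the method's signature already promises two independent mutable slices.</p>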
<p>Similarly, inside a struct implementation, the borrowing rules also prevent
factoring code into a private method easily. Consider this (artificial) example:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">pub</span> <span class="k">struct</span> <span class="n">Data</span> <span class="p">{</span>
<span class="n">buf</span><span class="p">:</span> <span class="p">[</span><span class="nb">u8</span><span class="p">;</span> <span class="mi">20</span><span class="p">],</span>
<span class="n">prefix_length</span><span class="p">:</span> <span class="nb">usize</span><span class="p">,</span>
<span class="n">sum</span><span class="p">:</span> <span class="nb">u32</span><span class="p">,</span>
<span class="n">port</span><span class="p">:</span> <span class="nb">u16</span><span class="p">,</span>
<span class="p">}</span>
<span class="k">impl</span> <span class="n">Data</span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">update_sum</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">content</span> <span class="o">=</span> <span class="o">&</span><span class="k">self</span><span class="py">.buf</span><span class="p">[</span><span class="k">self</span><span class="py">.prefix_length</span><span class="o">..</span><span class="p">];</span>
<span class="k">self</span><span class="py">.sum</span> <span class="o">=</span> <span class="n">content</span><span class="nf">.iter</span><span class="p">()</span><span class="nf">.cloned</span><span class="p">()</span><span class="nf">.map</span><span class="p">(</span><span class="nn">u32</span><span class="p">::</span><span class="n">from</span><span class="p">)</span><span class="nf">.sum</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">update_port</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">content</span> <span class="o">=</span> <span class="o">&</span><span class="k">self</span><span class="py">.buf</span><span class="p">[</span><span class="k">self</span><span class="py">.prefix_length</span><span class="o">..</span><span class="p">];</span>
<span class="k">self</span><span class="py">.port</span> <span class="o">=</span> <span class="p">(</span><span class="n">content</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="k">as</span> <span class="nb">u16</span><span class="p">)</span> <span class="o"><<</span> <span class="mi">8</span> <span class="p">|</span> <span class="n">content</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="k">as</span> <span class="nb">u16</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>Here, the <code class="language-plaintext highlighter-rouge">buf</code> field is an array storing some prefix and content contiguously.</p>
<p>We want to factorize the way we retrieve the <code class="language-plaintext highlighter-rouge">content</code> slice, so that
the <code class="language-plaintext highlighter-rouge">update_*()</code> methods are not bothered with the details. Let’s try:</p>
<figure class="highlight"><pre><code class="language-diff" data-lang="diff"> impl Data {
pub fn update_sum(&mut self) {
<span class="gd">- let content = &self.buf[self.prefix_length..];
</span><span class="gi">+ let content = self.content();
</span> self.sum = content.iter().cloned().map(u32::from).sum();
}
pub fn update_port(&mut self) {
<span class="gd">- let content = &self.buf[self.prefix_length..];
</span><span class="gi">+ let content = self.content();
</span> self.port = (content[2] as u16) << 8 | content[3] as u16;
}
<span class="gi">+
+ fn content(&mut self) -> &[u8] {
+ &self.buf[self.prefix_length..]
+ }
</span> }</code></pre></figure>
<p>Unfortunately, this does not compile:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>error[E0506]: cannot assign to `self.sum` because it is borrowed
--> facto2.rs:11:9
|
10 | let content = self.content();
| ---- borrow of `self.sum` occurs here
11 | self.sum = content.iter().cloned().map(u32::from).sum();
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ assignment to borrowed `self.sum` occurs here
error[E0506]: cannot assign to `self.port` because it is borrowed
--> facto2.rs:16:9
|
15 | let content = self.content();
| ---- borrow of `self.port` occurs here
16 | self.port = (content[2] as u16) << 8 | content[3] as u16;
|
</code></pre></div></div>
<p>As in the previous example, retrieving the reference through a method borrows
the whole struct (here, <code class="language-plaintext highlighter-rouge">self</code>).</p>
<p>To work around the problem, we can explain to the compiler that the fields are
disjoint:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">impl</span> <span class="n">Data</span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">update_sum</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">content</span> <span class="o">=</span> <span class="nn">Self</span><span class="p">::</span><span class="nf">content</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="py">.buf</span><span class="p">,</span> <span class="k">self</span><span class="py">.prefix_length</span><span class="p">);</span>
<span class="k">self</span><span class="py">.sum</span> <span class="o">=</span> <span class="n">content</span><span class="nf">.iter</span><span class="p">()</span><span class="nf">.cloned</span><span class="p">()</span><span class="nf">.map</span><span class="p">(</span><span class="nn">u32</span><span class="p">::</span><span class="n">from</span><span class="p">)</span><span class="nf">.sum</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">update_port</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">content</span> <span class="o">=</span> <span class="nn">Self</span><span class="p">::</span><span class="nf">content</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="py">.buf</span><span class="p">,</span> <span class="k">self</span><span class="py">.prefix_length</span><span class="p">);</span>
<span class="k">self</span><span class="py">.port</span> <span class="o">=</span> <span class="p">(</span><span class="n">content</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="k">as</span> <span class="nb">u16</span><span class="p">)</span> <span class="o"><<</span> <span class="mi">8</span> <span class="p">|</span> <span class="n">content</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="k">as</span> <span class="nb">u16</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="nf">content</span><span class="p">(</span><span class="n">buf</span><span class="p">:</span> <span class="o">&</span><span class="p">[</span><span class="nb">u8</span><span class="p">],</span> <span class="n">prefix_length</span><span class="p">:</span> <span class="nb">usize</span><span class="p">)</span> <span class="k">-></span> <span class="o">&</span><span class="p">[</span><span class="nb">u8</span><span class="p">]</span> <span class="p">{</span>
<span class="o">&</span><span class="n">buf</span><span class="p">[</span><span class="n">prefix_length</span><span class="o">..</span><span class="p">]</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>This compiles, but totally defeats the purpose of factorization: the caller has
to provide the necessary fields.</p>
<p>As an alternative, we can use a <a href="https://doc.rust-lang.org/book/first-edition/macros.html">macro</a> to <em>inline</em> the code:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="nd">macro_rules!</span> <span class="n">content</span> <span class="p">{</span>
<span class="p">(</span><span class="nv">$self:ident</span><span class="p">)</span> <span class="k">=></span> <span class="p">{</span>
<span class="o">&</span><span class="nv">$self</span><span class="py">.buf</span><span class="p">[</span><span class="nv">$self</span><span class="py">.prefix_length</span><span class="o">..</span><span class="p">]</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">impl</span> <span class="n">Data</span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">update_sum</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">content</span> <span class="o">=</span> <span class="nd">content!</span><span class="p">(</span><span class="k">self</span><span class="p">);</span>
<span class="k">self</span><span class="py">.sum</span> <span class="o">=</span> <span class="n">content</span><span class="nf">.iter</span><span class="p">()</span><span class="nf">.cloned</span><span class="p">()</span><span class="nf">.map</span><span class="p">(</span><span class="nn">u32</span><span class="p">::</span><span class="n">from</span><span class="p">)</span><span class="nf">.sum</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">update_port</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">content</span> <span class="o">=</span> <span class="nd">content!</span><span class="p">(</span><span class="k">self</span><span class="p">);</span>
<span class="k">self</span><span class="py">.port</span> <span class="o">=</span> <span class="p">(</span><span class="n">content</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="k">as</span> <span class="nb">u16</span><span class="p">)</span> <span class="o"><<</span> <span class="mi">8</span> <span class="p">|</span> <span class="n">content</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="k">as</span> <span class="nb">u16</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>But this seems far from ideal.</p>
<p>I think we must just live with it: encapsulation sometimes conflicts with the
borrowing rules. After all, this is not so surprising: enforcing the borrowing
rules requires following every concrete access to resources, while
encapsulation aims to abstract them away.</p>
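<p>One further variation, which is only a sketch and changes the struct layout, is to group the fields needed by <code class="language-plaintext highlighter-rouge">content()</code> into their own type. The <code class="language-plaintext highlighter-rouge">Buffer</code> struct and the <code class="language-plaintext highlighter-rouge">new()</code> constructor below are hypothetical names introduced for this illustration. Calling a method on the inner field borrows only that field, so the other fields remain assignable:</p>

```rust
/// Hypothetical helper type owning only what `content()` needs.
struct Buffer {
    buf: [u8; 20],
    prefix_length: usize,
}

impl Buffer {
    fn content(&self) -> &[u8] {
        &self.buf[self.prefix_length..]
    }
}

pub struct Data {
    buffer: Buffer,
    sum: u32,
    port: u16,
}

impl Data {
    pub fn new(buf: [u8; 20], prefix_length: usize) -> Self {
        Self {
            buffer: Buffer { buf, prefix_length },
            sum: 0,
            port: 0,
        }
    }

    pub fn update_sum(&mut self) {
        // Borrows `self.buffer` only, so `self.sum` can still be assigned.
        let content = self.buffer.content();
        self.sum = content.iter().cloned().map(u32::from).sum();
    }

    pub fn update_port(&mut self) {
        // Same here: `self.port` is disjoint from `self.buffer`.
        let content = self.buffer.content();
        self.port = (content[2] as u16) << 8 | content[3] as u16;
    }
}
```

<p>This keeps the <code class="language-plaintext highlighter-rouge">update_*()</code> methods free of slicing details, at the cost of shaping the data layout around the borrow checker, so the tension with encapsulation remains.</p>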
<h3 id="observer">Observer</h3>
<p>The <a href="https://en.wikipedia.org/wiki/Observer_pattern"><em>observer</em> pattern</a> is useful for registering event listeners on
an object.</p>
<p>In some cases, <strong>this pattern may not be straightforward to implement in Rust</strong>.</p>
<p>For simplicity, let’s consider that the events are <code class="language-plaintext highlighter-rouge">u32</code> values. Here
is a possible implementation:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">pub</span> <span class="k">trait</span> <span class="n">EventListener</span> <span class="p">{</span>
<span class="k">fn</span> <span class="nf">on_event</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="p">,</span> <span class="n">event</span><span class="p">:</span> <span class="nb">u32</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">Notifier</span> <span class="p">{</span>
<span class="n">listeners</span><span class="p">:</span> <span class="nb">Vec</span><span class="o"><</span><span class="nb">Box</span><span class="o"><</span><span class="n">EventListener</span><span class="o">>></span><span class="p">,</span>
<span class="p">}</span>
<span class="k">impl</span> <span class="n">Notifier</span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">new</span><span class="p">()</span> <span class="k">-></span> <span class="n">Self</span> <span class="p">{</span>
<span class="n">Self</span> <span class="p">{</span> <span class="n">listeners</span><span class="p">:</span> <span class="nn">Vec</span><span class="p">::</span><span class="nf">new</span><span class="p">()</span> <span class="p">}</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="n">register</span><span class="o"><</span><span class="n">T</span><span class="p">:</span> <span class="n">EventListener</span> <span class="o">+</span> <span class="nv">'static</span><span class="o">></span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">listener</span><span class="p">:</span> <span class="n">T</span><span class="p">)</span> <span class="p">{</span>
<span class="k">self</span><span class="py">.listeners</span><span class="nf">.push</span><span class="p">(</span><span class="nn">Box</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="n">listener</span><span class="p">));</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">notify</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="p">,</span> <span class="n">event</span><span class="p">:</span> <span class="nb">u32</span><span class="p">)</span> <span class="p">{</span>
<span class="k">for</span> <span class="n">listener</span> <span class="n">in</span> <span class="o">&</span><span class="k">self</span><span class="py">.listeners</span> <span class="p">{</span>
<span class="n">listener</span><span class="nf">.on_event</span><span class="p">(</span><span class="n">event</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>For convenience, make closures implement our <code class="language-plaintext highlighter-rouge">EventListener</code> trait:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">impl</span><span class="o"><</span><span class="n">F</span><span class="p">:</span> <span class="nf">Fn</span><span class="p">(</span><span class="nb">u32</span><span class="p">)</span><span class="o">></span> <span class="n">EventListener</span> <span class="k">for</span> <span class="n">F</span> <span class="p">{</span>
<span class="k">fn</span> <span class="nf">on_event</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="p">,</span> <span class="n">event</span><span class="p">:</span> <span class="nb">u32</span><span class="p">)</span> <span class="p">{</span>
<span class="k">self</span><span class="p">(</span><span class="n">event</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>Thus, its usage is simple:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"> <span class="k">let</span> <span class="k">mut</span> <span class="n">notifier</span> <span class="o">=</span> <span class="nn">Notifier</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
<span class="n">notifier</span><span class="nf">.register</span><span class="p">(|</span><span class="n">event</span><span class="p">|</span> <span class="nd">println!</span><span class="p">(</span><span class="s">"received [{}]"</span><span class="p">,</span> <span class="n">event</span><span class="p">));</span>
<span class="nd">println!</span><span class="p">(</span><span class="s">"notifying..."</span><span class="p">);</span>
<span class="n">notifier</span><span class="nf">.notify</span><span class="p">(</span><span class="mi">42</span><span class="p">);</span></code></pre></figure>
<p>This prints:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>notifying...
received [42]
</code></pre></div></div>
<p>So far, so good.</p>
<p>However, things get a bit more complicated if we want to mutate a state when an
event is received. For example, let’s implement a struct storing all the events
we received:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">pub</span> <span class="k">struct</span> <span class="n">Storage</span> <span class="p">{</span>
<span class="n">events</span><span class="p">:</span> <span class="nb">Vec</span><span class="o"><</span><span class="nb">u32</span><span class="o">></span><span class="p">,</span>
<span class="p">}</span>
<span class="k">impl</span> <span class="n">Storage</span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">new</span><span class="p">()</span> <span class="k">-></span> <span class="n">Self</span> <span class="p">{</span>
<span class="n">Self</span> <span class="p">{</span> <span class="n">events</span><span class="p">:</span> <span class="nn">Vec</span><span class="p">::</span><span class="nf">new</span><span class="p">()</span> <span class="p">}</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">store</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">value</span><span class="p">:</span> <span class="nb">u32</span><span class="p">)</span> <span class="p">{</span>
<span class="k">self</span><span class="py">.events</span><span class="nf">.push</span><span class="p">(</span><span class="n">value</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">events</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="o">&</span><span class="nb">Vec</span><span class="o"><</span><span class="nb">u32</span><span class="o">></span> <span class="p">{</span>
<span class="o">&</span><span class="k">self</span><span class="py">.events</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>To be able to fill this <code class="language-plaintext highlighter-rouge">Storage</code> on each event received, we somehow have to
pass it along with the event listener, which will be stored in the <code class="language-plaintext highlighter-rouge">Notifier</code>.
Therefore, we need a single instance of <code class="language-plaintext highlighter-rouge">Storage</code> to be <strong>shared</strong> between the
caller code and the <code class="language-plaintext highlighter-rouge">Notifier</code>.</p>
<p>Holding two mutable references to the same object obviously violates the
borrowing rules, so we need a <a href="https://doc.rust-lang.org/std/rc/">reference-counting pointer</a>.</p>
<p>However, an <code class="language-plaintext highlighter-rouge">Rc</code> only gives shared, read-only access to its content, so we also need a <a href="https://doc.rust-lang.org/std/cell/index.html"><code class="language-plaintext highlighter-rouge">RefCell</code></a>
for <a href="https://ricardomartins.cc/2016/06/08/interior-mutability">interior mutability</a>.</p>
<p>Thus, we will use an instance of <code class="language-plaintext highlighter-rouge">Rc<RefCell<Storage>></code>. It may seem too
verbose, but using <code class="language-plaintext highlighter-rouge">Rc<RefCell<T>></code> (or <code class="language-plaintext highlighter-rouge">Arc<Mutex<T>></code> for thread-safety) is
very common in Rust. And <a href="https://www.reddit.com/r/rust/comments/33jv62/vecrcrefcellboxtrait_is_there_a_better_way/">there is worse</a>.</p>
<p>Here is the resulting client code:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"> <span class="k">use</span> <span class="nn">std</span><span class="p">::</span><span class="nn">cell</span><span class="p">::</span><span class="n">RefCell</span><span class="p">;</span>
<span class="k">use</span> <span class="nn">std</span><span class="p">::</span><span class="nn">rc</span><span class="p">::</span><span class="nb">Rc</span><span class="p">;</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">notifier</span> <span class="o">=</span> <span class="nn">Notifier</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
<span class="c">// first Rc to the Storage</span>
<span class="k">let</span> <span class="n">rc</span> <span class="o">=</span> <span class="nn">Rc</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">RefCell</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">Storage</span><span class="p">::</span><span class="nf">new</span><span class="p">()));</span>
<span class="c">// second Rc to the Storage</span>
<span class="k">let</span> <span class="n">rc2</span> <span class="o">=</span> <span class="n">rc</span><span class="nf">.clone</span><span class="p">();</span>
<span class="c">// register the listener saving all the received events to the Storage</span>
<span class="n">notifier</span><span class="nf">.register</span><span class="p">(</span><span class="k">move</span> <span class="p">|</span><span class="n">event</span><span class="p">|</span> <span class="n">rc2</span><span class="nf">.borrow_mut</span><span class="p">()</span><span class="nf">.store</span><span class="p">(</span><span class="n">event</span><span class="p">));</span>
<span class="n">notifier</span><span class="nf">.notify</span><span class="p">(</span><span class="mi">3</span><span class="p">);</span>
<span class="n">notifier</span><span class="nf">.notify</span><span class="p">(</span><span class="mi">141</span><span class="p">);</span>
<span class="n">notifier</span><span class="nf">.notify</span><span class="p">(</span><span class="mi">59</span><span class="p">);</span>
<span class="nd">assert_eq!</span><span class="p">(</span><span class="o">&</span><span class="nd">vec!</span><span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">141</span><span class="p">,</span> <span class="mi">59</span><span class="p">],</span> <span class="n">rc</span><span class="nf">.borrow</span><span class="p">()</span><span class="nf">.events</span><span class="p">());</span></code></pre></figure>
<p>That way, the <code class="language-plaintext highlighter-rouge">Storage</code> is correctly mutated from the event listener.</p>
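<p>For the thread-safe counterpart mentioned above, here is a minimal sketch using an <code class="language-plaintext highlighter-rouge">Arc</code> and a <code class="language-plaintext highlighter-rouge">Mutex</code>. It is not taken from the project: the function name <code class="language-plaintext highlighter-rouge">store_from_threads</code> is hypothetical, and the events are pushed directly from threads instead of going through the <code class="language-plaintext highlighter-rouge">Notifier</code>.</p>

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical helper: stores three events from three different threads.
pub fn store_from_threads() -> Vec<u32> {
    // Arc shares ownership across threads, Mutex provides the mutability
    let events = Arc::new(Mutex::new(Vec::new()));

    let handles: Vec<_> = vec![3u32, 141, 59]
        .into_iter()
        .map(|event| {
            // each thread gets its own Arc pointing to the same Vec
            let events = Arc::clone(&events);
            thread::spawn(move || events.lock().unwrap().push(event))
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // the insertion order depends on thread scheduling, so sort before returning
    let mut stored = events.lock().unwrap().clone();
    stored.sort_unstable();
    stored
}
```

<p>The shape is the same as with the single-threaded pair: one pointer type for shared ownership, one wrapper for mutation, with the lock taking the role of the runtime borrow check.</p>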
<p>Not everything is solved, though. In this example, we had access to the
<code class="language-plaintext highlighter-rouge">Rc<RefCell<Storage>></code> instance. What if we only have access to the <code class="language-plaintext highlighter-rouge">Storage</code>,
e.g. if we want <code class="language-plaintext highlighter-rouge">Storage</code> to register itself from one of its methods, without
requiring the caller to provide the <code class="language-plaintext highlighter-rouge">Rc<RefCell<Storage>></code> instance?</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">impl</span> <span class="n">Storage</span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">register_to</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="p">,</span> <span class="n">notifier</span><span class="p">:</span> <span class="o">&</span><span class="k">mut</span> <span class="n">Notifier</span><span class="p">)</span> <span class="p">{</span>
<span class="n">notifier</span><span class="nf">.register</span><span class="p">(</span><span class="k">move</span> <span class="p">|</span><span class="n">event</span><span class="p">|</span> <span class="p">{</span>
<span class="cm">/* how to retrieve a &mut Storage from here? */</span>
<span class="p">});</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>We need to retrieve the <code class="language-plaintext highlighter-rouge">Rc<RefCell<Storage>></code> from the <code class="language-plaintext highlighter-rouge">Storage</code> in some way.</p>
<p>To do so, the idea is to make the <code class="language-plaintext highlighter-rouge">Storage</code> aware of its
reference-counting pointer. <em>Of course, this only makes sense if <code class="language-plaintext highlighter-rouge">Storage</code> is
constructed inside a <code class="language-plaintext highlighter-rouge">Rc<RefCell<Storage>></code>.</em></p>
<p>This is exactly what <a href="http://en.cppreference.com/w/cpp/memory/enable_shared_from_this"><code class="language-plaintext highlighter-rouge">enable_shared_from_this</code></a> provides in C++, so we can draw
inspiration from <a href="https://stackoverflow.com/a/34062114/1987178">how it works</a>: just store a
<code class="language-plaintext highlighter-rouge">Weak<RefCell<…>></code>, <a href="https://doc.rust-lang.org/std/rc/struct.Rc.html#method.downgrade">downgraded</a> from the <code class="language-plaintext highlighter-rouge">Rc<RefCell<…>></code>, into the structure
itself. That way, we can use it to get a <code class="language-plaintext highlighter-rouge">&mut Storage</code> reference back in the
event listener:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">use</span> <span class="nn">std</span><span class="p">::</span><span class="nn">rc</span><span class="p">::{</span><span class="nb">Rc</span><span class="p">,</span> <span class="n">Weak</span><span class="p">};</span>
<span class="k">use</span> <span class="nn">std</span><span class="p">::</span><span class="nn">cell</span><span class="p">::</span><span class="n">RefCell</span><span class="p">;</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">Storage</span> <span class="p">{</span>
<span class="n">self_weak</span><span class="p">:</span> <span class="n">Weak</span><span class="o"><</span><span class="n">RefCell</span><span class="o"><</span><span class="n">Storage</span><span class="o">>></span><span class="p">,</span>
<span class="n">events</span><span class="p">:</span> <span class="nb">Vec</span><span class="o"><</span><span class="nb">u32</span><span class="o">></span><span class="p">,</span>
<span class="p">}</span>
<span class="k">impl</span> <span class="n">Storage</span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">new</span><span class="p">()</span> <span class="k">-></span> <span class="nb">Rc</span><span class="o"><</span><span class="n">RefCell</span><span class="o"><</span><span class="n">Self</span><span class="o">>></span> <span class="p">{</span>
<span class="k">let</span> <span class="n">rc</span> <span class="o">=</span> <span class="nn">Rc</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">RefCell</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="n">Self</span> <span class="p">{</span>
<span class="n">self_weak</span><span class="p">:</span> <span class="nn">Weak</span><span class="p">::</span><span class="nf">new</span><span class="p">(),</span> <span class="c">// initialize empty</span>
<span class="n">events</span><span class="p">:</span> <span class="nn">Vec</span><span class="p">::</span><span class="nf">new</span><span class="p">(),</span>
<span class="p">}));</span>
<span class="c">// set self_weak once we get the Rc instance</span>
<span class="n">rc</span><span class="nf">.borrow_mut</span><span class="p">()</span><span class="py">.self_weak</span> <span class="o">=</span> <span class="nn">Rc</span><span class="p">::</span><span class="nf">downgrade</span><span class="p">(</span><span class="o">&</span><span class="n">rc</span><span class="p">);</span>
<span class="n">rc</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">register_to</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="p">,</span> <span class="n">notifier</span><span class="p">:</span> <span class="o">&</span><span class="k">mut</span> <span class="n">Notifier</span><span class="p">)</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">rc</span> <span class="o">=</span> <span class="k">self</span><span class="py">.self_weak</span><span class="nf">.upgrade</span><span class="p">()</span><span class="nf">.unwrap</span><span class="p">();</span>
<span class="n">notifier</span><span class="nf">.register</span><span class="p">(</span><span class="k">move</span> <span class="p">|</span><span class="n">event</span><span class="p">|</span> <span class="n">rc</span><span class="nf">.borrow_mut</span><span class="p">()</span><span class="nf">.store</span><span class="p">(</span><span class="n">event</span><span class="p">))</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>Here is how to use it:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"> <span class="k">let</span> <span class="k">mut</span> <span class="n">notifier</span> <span class="o">=</span> <span class="nn">Notifier</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
<span class="k">let</span> <span class="n">rc</span> <span class="o">=</span> <span class="nn">Storage</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
<span class="n">rc</span><span class="nf">.borrow</span><span class="p">()</span><span class="nf">.register_to</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="n">notifier</span><span class="p">);</span>
<span class="n">notifier</span><span class="nf">.notify</span><span class="p">(</span><span class="mi">3</span><span class="p">);</span>
<span class="n">notifier</span><span class="nf">.notify</span><span class="p">(</span><span class="mi">141</span><span class="p">);</span>
<span class="n">notifier</span><span class="nf">.notify</span><span class="p">(</span><span class="mi">59</span><span class="p">);</span>
<span class="nd">assert_eq!</span><span class="p">(</span><span class="o">&</span><span class="nd">vec!</span><span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">141</span><span class="p">,</span> <span class="mi">59</span><span class="p">],</span> <span class="n">rc</span><span class="nf">.borrow</span><span class="p">()</span><span class="nf">.events</span><span class="p">());</span></code></pre></figure>
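<p>As a possible variant, recent versions of the standard library provide <code class="language-plaintext highlighter-rouge">Rc::new_cyclic</code>, which passes the <code class="language-plaintext highlighter-rouge">Weak</code> to the constructor closure, so the two-step initialization is not needed. The sketch below is an assumption about how the example could be adapted, not the code used in the project. It also captures the <code class="language-plaintext highlighter-rouge">Weak</code> in the listener (a design choice of this sketch), so that the <code class="language-plaintext highlighter-rouge">Notifier</code> does not keep the <code class="language-plaintext highlighter-rouge">Storage</code> alive. The <code class="language-plaintext highlighter-rouge">observer_demo</code> function is hypothetical.</p>

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

pub trait EventListener {
    fn on_event(&self, event: u32);
}

// closures taking a u32 are listeners
impl<F: Fn(u32)> EventListener for F {
    fn on_event(&self, event: u32) {
        self(event);
    }
}

pub struct Notifier {
    listeners: Vec<Box<dyn EventListener>>,
}

impl Notifier {
    pub fn new() -> Self {
        Self { listeners: Vec::new() }
    }

    pub fn register<T: EventListener + 'static>(&mut self, listener: T) {
        self.listeners.push(Box::new(listener));
    }

    pub fn notify(&self, event: u32) {
        for listener in &self.listeners {
            listener.on_event(event);
        }
    }
}

pub struct Storage {
    self_weak: Weak<RefCell<Storage>>,
    events: Vec<u32>,
}

impl Storage {
    pub fn new() -> Rc<RefCell<Self>> {
        // new_cyclic provides the Weak while the value is being built,
        // so self_weak never holds a placeholder
        Rc::new_cyclic(|weak| {
            RefCell::new(Self {
                self_weak: weak.clone(),
                events: Vec::new(),
            })
        })
    }

    pub fn store(&mut self, value: u32) {
        self.events.push(value);
    }

    pub fn events(&self) -> &Vec<u32> {
        &self.events
    }

    pub fn register_to(&self, notifier: &mut Notifier) {
        // capture the Weak: the listener ignores events once the Storage is dropped
        let weak = self.self_weak.clone();
        notifier.register(move |event: u32| {
            if let Some(rc) = weak.upgrade() {
                rc.borrow_mut().store(event);
            }
        });
    }
}

/// Hypothetical usage: same scenario as above, returning the stored events.
pub fn observer_demo() -> Vec<u32> {
    let mut notifier = Notifier::new();
    let rc = Storage::new();
    rc.borrow().register_to(&mut notifier);
    notifier.notify(3);
    notifier.notify(141);
    notifier.notify(59);
    let events = rc.borrow().events().clone();
    events
}
```

<p>Capturing the strong <code class="language-plaintext highlighter-rouge">Rc</code>, as in the version above, is equally valid when the listener is expected to keep the <code class="language-plaintext highlighter-rouge">Storage</code> alive.</p>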
<p>So it is possible to implement the <em>observer</em> pattern in Rust, but this is a bit
more challenging than in Java ;-)</p>
<p>When possible, it might be preferable to avoid it.</p>
<h3 id="mutable-data-sharing">Mutable data sharing</h3>
<blockquote>
<p>Mutable references cannot be <a href="https://doc.rust-lang.org/nomicon/references.html">aliased</a>.</p>
</blockquote>
<p>How to share mutable data, then?</p>
<p>We saw that we can use <code class="language-plaintext highlighter-rouge">Rc<RefCell<…>></code> (or <code class="language-plaintext highlighter-rouge">Arc<Mutex<…>></code>), which enforce the
borrowing rules at runtime. However, this is not always desirable:</p>
<ul>
<li>it forces a new allocation on the heap,</li>
<li>each access has a runtime cost,</li>
<li>it always borrows the whole resource.</li>
</ul>
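<p>To illustrate what "enforcing the borrowing rules at runtime" means, here is a small sketch (the function name <code class="language-plaintext highlighter-rouge">runtime_borrow_check</code> is hypothetical). It uses <code class="language-plaintext highlighter-rouge">try_borrow_mut()</code>, which reports a conflict as an error where <code class="language-plaintext highlighter-rouge">borrow_mut()</code> would panic.</p>

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Returns (conflict_detected, final_value).
pub fn runtime_borrow_check() -> (bool, Vec<u32>) {
    let shared = Rc::new(RefCell::new(vec![1u32, 2]));
    let alias = Rc::clone(&shared);

    let conflict_detected = {
        // a shared borrow is active in this scope...
        let reader = shared.borrow();
        // ...so a mutable borrow through the alias is refused at runtime
        let refused = alias.try_borrow_mut().is_err();
        // the reader is still in use here
        let _len = reader.len();
        refused
    };

    // the reader has been dropped: the mutable borrow now succeeds
    alias.borrow_mut().push(3);
    let final_value = shared.borrow().clone();
    (conflict_detected, final_value)
}
```

<p>The check happens on every access and covers the whole wrapped value, which corresponds to the last two drawbacks listed above.</p>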
<p>Alternatively, we could use <a href="https://doc.rust-lang.org/book/first-edition/raw-pointers.html">raw pointers</a> manually inside <a href="https://doc.rust-lang.org/book/first-edition/unsafe.html">unsafe</a> code, but
then this would be <em>unsafe</em>.</p>
<p>There is another way: exposing <strong>temporary borrowing
views</strong> of an object. Let me explain.</p>
<p>In <em>Gnirehtet</em>, a packet contains a reference to the raw data (stored in some
buffer elsewhere) along with the <a href="https://en.wikipedia.org/wiki/IPv4#Packet_structure">IP</a> and <a href="https://en.wikipedia.org/wiki/Transmission_Control_Protocol#TCP_segment_structure">TCP</a>/<a href="https://en.wikipedia.org/wiki/User_Datagram_Protocol#Packet_structure">UDP</a> header field values
(parsed from the raw data). We could have used a flat structure to store
everything:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">pub</span> <span class="k">struct</span> <span class="n">Packet</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="n">raw</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">],</span>
<span class="n">ipv4_source</span><span class="p">:</span> <span class="nb">u32</span><span class="p">,</span>
<span class="n">ipv4_destination</span><span class="p">:</span> <span class="nb">u32</span><span class="p">,</span>
<span class="n">ipv4_protocol</span><span class="p">:</span> <span class="nb">u8</span><span class="p">,</span>
<span class="c">// + other ipv4 fields</span>
<span class="n">transport_source</span><span class="p">:</span> <span class="nb">u16</span><span class="p">,</span>
<span class="n">transport_destination</span><span class="p">:</span> <span class="nb">u16</span><span class="p">,</span>
<span class="c">// + other transport fields</span>
<span class="p">}</span></code></pre></figure>
<p>The <code class="language-plaintext highlighter-rouge">Packet</code> would provide <em>setters</em> for all the header fields (updating both
the packet fields and the raw data). For example:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">impl</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="n">Packet</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">set_transport_source</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">transport_source</span><span class="p">:</span> <span class="nb">u16</span><span class="p">)</span> <span class="p">{</span>
<span class="k">self</span><span class="py">.transport_source</span> <span class="o">=</span> <span class="n">transport_source</span><span class="p">;</span>
<span class="k">let</span> <span class="n">transport</span> <span class="o">=</span> <span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.raw</span><span class="p">[</span><span class="mi">20</span><span class="o">..</span><span class="p">];</span>
<span class="nn">BigEndian</span><span class="p">::</span><span class="nf">write_u16</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="n">transport</span><span class="p">[</span><span class="mi">0</span><span class="o">..</span><span class="mi">2</span><span class="p">],</span> <span class="n">transport_source</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>But this would be poor design (especially since TCP and UDP header fields
are different).</p>
<p>Instead, we would like to extract IP and transport headers to separate structs,
managing their own part of the raw data:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c">// violates the borrowing rules</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">Packet</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="n">raw</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">],</span> <span class="c">// the whole packet (including headers)</span>
<span class="n">ipv4_header</span><span class="p">:</span> <span class="n">Ipv4Header</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span><span class="p">,</span>
<span class="n">transport_header</span><span class="p">:</span> <span class="n">TransportHeader</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span><span class="p">,</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">Ipv4Header</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="n">raw</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">],</span> <span class="c">// slice related to ipv4 headers</span>
<span class="n">source</span><span class="p">:</span> <span class="nb">u32</span><span class="p">,</span>
<span class="n">destination</span><span class="p">:</span> <span class="nb">u32</span><span class="p">,</span>
<span class="n">protocol</span><span class="p">:</span> <span class="nb">u8</span><span class="p">,</span>
<span class="c">// + other ipv4 fields</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">TransportHeader</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="n">raw</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">],</span> <span class="c">// slice related to transport headers</span>
<span class="n">source</span><span class="p">:</span> <span class="nb">u16</span><span class="p">,</span>
<span class="n">destination</span><span class="p">:</span> <span class="nb">u16</span><span class="p">,</span>
<span class="c">// + other transport fields</span>
<span class="p">}</span></code></pre></figure>
<p>You immediately spotted the problem: <strong>there are several references to the
same resource, the <code class="language-plaintext highlighter-rouge">raw</code> byte array, at the same time</strong>.</p>
<p><em>Note that <a href="https://doc.rust-lang.org/std/primitive.slice.html#method.split_at_mut">splitting</a> the array is not a possibility here, since the <code class="language-plaintext highlighter-rouge">raw</code>
slices overlap: we need to write the whole packet at once to the network, so the
<code class="language-plaintext highlighter-rouge">raw</code> array in <code class="language-plaintext highlighter-rouge">Packet</code> must include the headers.</em></p>
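<p>For contrast, here is a minimal sketch of what <code class="language-plaintext highlighter-rouge">split_at_mut()</code> does provide: two mutable slices that never overlap. The 40-byte buffer, the 20-byte boundary and the function name <code class="language-plaintext highlighter-rouge">split_demo</code> are assumptions made for the illustration.</p>

```rust
/// Writes into both halves of a 40-byte buffer through two mutable slices.
pub fn split_demo() -> [u8; 40] {
    let mut raw = [0u8; 40];
    {
        // two disjoint mutable slices: bytes 0..20 and bytes 20..40
        let (ipv4, transport) = raw.split_at_mut(20);
        ipv4[0] = 0x45;
        transport[0] = 0x04;
        // both slices are usable at the same time because they never overlap
        ipv4[1] = transport[0];
    }
    // the two slices are gone, the whole buffer is accessible again
    raw
}
```

<p>This works only because no third reference covers the whole buffer while the two halves are borrowed, which is precisely what the struct above would require.</p>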
<p>We need a solution compatible with the borrowing rules.</p>
<p>Here is the one I came up with:</p>
<ul>
<li>store the header data separately, without the <code class="language-plaintext highlighter-rouge">raw</code> slices,</li>
<li>create <em>view</em> structs for IP and transport headers, with <a href="https://doc.rust-lang.org/book/first-edition/lifetimes.html#in-structs">lifetime bounds</a>,</li>
<li>expose <code class="language-plaintext highlighter-rouge">Packet</code> methods returning <em>view</em> instances.</li>
</ul>
<p>And here is a simplification of the actual implementation:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">pub</span> <span class="k">struct</span> <span class="n">Packet</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="n">raw</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">],</span>
<span class="n">ipv4_header</span><span class="p">:</span> <span class="n">Ipv4HeaderData</span><span class="p">,</span>
<span class="n">transport_header</span><span class="p">:</span> <span class="n">TransportHeaderData</span><span class="p">,</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">Ipv4HeaderData</span> <span class="p">{</span>
<span class="n">source</span><span class="p">:</span> <span class="nb">u32</span><span class="p">,</span>
<span class="n">destination</span><span class="p">:</span> <span class="nb">u32</span><span class="p">,</span>
<span class="n">protocol</span><span class="p">:</span> <span class="nb">u8</span><span class="p">,</span>
<span class="c">// + other ipv4 fields</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">TransportHeaderData</span> <span class="p">{</span>
<span class="n">source</span><span class="p">:</span> <span class="nb">u16</span><span class="p">,</span>
<span class="n">destination</span><span class="p">:</span> <span class="nb">u16</span><span class="p">,</span>
<span class="c">// + other transport fields</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">Ipv4Header</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="n">raw</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">],</span>
<span class="n">data</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="k">mut</span> <span class="n">Ipv4HeaderData</span><span class="p">,</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">TransportHeader</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="n">raw</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">],</span>
<span class="n">data</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="k">mut</span> <span class="n">TransportHeaderData</span><span class="p">,</span>
<span class="p">}</span>
<span class="k">impl</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="n">Packet</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">ipv4_header</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="n">Ipv4Header</span> <span class="p">{</span>
<span class="n">Ipv4Header</span> <span class="p">{</span>
<span class="n">raw</span><span class="p">:</span> <span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.raw</span><span class="p">[</span><span class="o">..</span><span class="mi">20</span><span class="p">],</span>
<span class="n">data</span><span class="p">:</span> <span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.ipv4_header</span><span class="p">,</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">transport_header</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="n">TransportHeader</span> <span class="p">{</span>
<span class="n">TransportHeader</span> <span class="p">{</span>
<span class="n">raw</span><span class="p">:</span> <span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.raw</span><span class="p">[</span><span class="mi">20</span><span class="o">..</span><span class="mi">40</span><span class="p">],</span>
<span class="n">data</span><span class="p">:</span> <span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.transport_header</span><span class="p">,</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>The <em>setters</em> are implemented on the views, which hold a mutable reference
to the raw array:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">impl</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="n">TransportHeader</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">set_source</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">source</span><span class="p">:</span> <span class="nb">u16</span><span class="p">)</span> <span class="p">{</span>
<span class="k">self</span><span class="py">.data.source</span> <span class="o">=</span> <span class="n">source</span><span class="p">;</span>
<span class="nn">BigEndian</span><span class="p">::</span><span class="nf">write_u16</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.raw</span><span class="p">[</span><span class="mi">0</span><span class="o">..</span><span class="mi">2</span><span class="p">],</span> <span class="n">source</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">set_destination</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">destination</span><span class="p">:</span> <span class="nb">u16</span><span class="p">)</span> <span class="p">{</span>
<span class="k">self</span><span class="py">.data.destination</span> <span class="o">=</span> <span class="n">destination</span><span class="p">;</span>
<span class="nn">BigEndian</span><span class="p">::</span><span class="nf">write_u16</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.raw</span><span class="p">[</span><span class="mi">2</span><span class="o">..</span><span class="mi">4</span><span class="p">],</span> <span class="n">destination</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>That way, the borrowing rules are respected, and the API is elegant:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"> <span class="k">let</span> <span class="k">mut</span> <span class="n">packet</span> <span class="o">=</span> <span class="err">…</span><span class="p">;</span>
<span class="c">// "transport_header" borrows "packet" during its scope</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">transport_header</span> <span class="o">=</span> <span class="n">packet</span><span class="nf">.transport_header</span><span class="p">();</span>
<span class="n">transport_header</span><span class="nf">.set_source</span><span class="p">(</span><span class="mi">1234</span><span class="p">);</span>
<span class="n">transport_header</span><span class="nf">.set_destination</span><span class="p">(</span><span class="mi">1234</span><span class="p">);</span></code></pre></figure>
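<p>To make the pattern concrete, here is a minimal, self-contained sketch of such a borrowing view. The types are simplified and hypothetical: only the two port fields are modeled, <code class="language-plaintext highlighter-rouge">TransportHeaderData</code> is an assumed name, and the standard <code class="language-plaintext highlighter-rouge">u16::to_be_bytes</code> stands in for the <code class="language-plaintext highlighter-rouge">byteorder</code> crate so that the sample compiles without dependencies:</p>

```rust
// Hypothetical, simplified types: only the two port fields are modeled.
struct TransportHeaderData {
    source: u16,
    destination: u16,
}

struct Packet {
    raw: [u8; 40],
    transport_header: TransportHeaderData,
}

// The view mutably borrows both the raw bytes and the parsed data.
struct TransportHeader<'a> {
    raw: &'a mut [u8],
    data: &'a mut TransportHeaderData,
}

impl Packet {
    fn new() -> Self {
        Packet {
            raw: [0; 40],
            transport_header: TransportHeaderData { source: 0, destination: 0 },
        }
    }

    // Borrows two disjoint fields of "self" for the lifetime of the view.
    fn transport_header(&mut self) -> TransportHeader<'_> {
        TransportHeader {
            raw: &mut self.raw[20..40],
            data: &mut self.transport_header,
        }
    }
}

impl<'a> TransportHeader<'a> {
    // Each setter keeps the parsed value and the raw bytes in sync.
    fn set_source(&mut self, source: u16) {
        self.data.source = source;
        self.raw[0..2].copy_from_slice(&source.to_be_bytes());
    }

    fn set_destination(&mut self, destination: u16) {
        self.data.destination = destination;
        self.raw[2..4].copy_from_slice(&destination.to_be_bytes());
    }
}

fn main() {
    let mut packet = Packet::new();
    {
        let mut header = packet.transport_header();
        header.set_source(1234);
        header.set_destination(80);
    } // the view goes out of scope here, so "packet" is usable again
    assert_eq!(packet.transport_header.source, 1234);
    assert_eq!(packet.transport_header.destination, 80);
    assert_eq!(&packet.raw[20..24], &[0x04, 0xd2, 0x00, 0x50]);
}
```

<p>The point of the design is that the view is the only way to mutate the header while it exists, so the raw bytes and the parsed fields cannot get out of sync.</p>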
<h3 id="compiler-limitations">Compiler limitations</h3>
<p>Rust is a young language, and the compiler has some annoying pitfalls.</p>
<p>The worst, in my opinion, is the lack of <a href="http://smallcultfollowing.com/babysteps/blog/2016/04/27/non-lexical-lifetimes-introduction/#problem-case-2-conditional-control-flow">non-lexical lifetimes</a>, which leads to
<a href="https://stackoverflow.com/questions/44417491/non-lexical-lifetime-workaround-failure">unexpected errors</a>:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">struct</span> <span class="n">Container</span> <span class="p">{</span>
<span class="n">vec</span><span class="p">:</span> <span class="nb">Vec</span><span class="o"><</span><span class="nb">i32</span><span class="o">></span><span class="p">,</span>
<span class="p">}</span>
<span class="k">impl</span> <span class="n">Container</span> <span class="p">{</span>
<span class="k">fn</span> <span class="nf">find</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">v</span><span class="p">:</span> <span class="nb">i32</span><span class="p">)</span> <span class="k">-></span> <span class="nb">Option</span><span class="o"><&</span><span class="k">mut</span> <span class="nb">i32</span><span class="o">></span> <span class="p">{</span>
<span class="nb">None</span> <span class="c">// we don't care about the implementation</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="nf">get</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">v</span><span class="p">:</span> <span class="nb">i32</span><span class="p">)</span> <span class="k">-></span> <span class="o">&</span><span class="k">mut</span> <span class="nb">i32</span> <span class="p">{</span>
<span class="k">if</span> <span class="k">let</span> <span class="nf">Some</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="o">=</span> <span class="k">self</span><span class="nf">.find</span><span class="p">(</span><span class="n">v</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="n">x</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">self</span><span class="py">.vec</span><span class="nf">.push</span><span class="p">(</span><span class="n">v</span><span class="p">);</span>
<span class="k">self</span><span class="py">.vec</span><span class="nf">.last_mut</span><span class="p">()</span><span class="nf">.unwrap</span><span class="p">()</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>error[E0499]: cannot borrow `self.vec` as mutable more than once at a time
--> sample.rs:14:9
|
11 | if let Some(x) = self.find(v) {
| ---- first mutable borrow occurs here
...
14 | self.vec.push(v);
| ^^^^^^^^ second mutable borrow occurs here
15 | self.vec.last_mut().unwrap()
16 | }
| - first borrow ends here
</code></pre></div></div>
<p>Hopefully, <a href="http://smallcultfollowing.com/babysteps/blog/2017/07/11/non-lexical-lifetimes-draft-rfc-and-prototype-available/">it should be fixed soon</a>.</p>
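<p>In the meantime, one possible workaround (a sketch, not necessarily what Gnirehtet does) is to look up an <em>index</em> instead of a reference, so that no borrow of <code class="language-plaintext highlighter-rouge">self</code> escapes the conditional branch. The lookup through <code class="language-plaintext highlighter-rouge">position()</code> is an assumption, since the implementation of <code class="language-plaintext highlighter-rouge">find()</code> is elided above:</p>

```rust
struct Container {
    vec: Vec<i32>,
}

impl Container {
    // Returns a mutable reference to the element equal to `v`,
    // appending it first if it is missing.
    fn get(&mut self, v: i32) -> &mut i32 {
        // The shared borrow taken by iter() ends with this statement:
        // only a plain usize survives it.
        let found = self.vec.iter().position(|&x| x == v);
        let index = match found {
            Some(index) => index,
            None => {
                self.vec.push(v);
                self.vec.len() - 1
            }
        };
        // A single mutable borrow, created after the lookup is over.
        &mut self.vec[index]
    }
}

fn main() {
    let mut container = Container { vec: vec![1, 2] };
    *container.get(2) += 10; // existing element, modified in place
    container.get(7); // missing element, appended
    assert_eq!(container.vec, vec![1, 12, 7]);
}
```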
<p>The <a href="https://github.com/rust-lang/rfcs/blob/master/text/1522-conservative-impl-trait.md"><em>Impl Trait</em></a> feature, which allows functions to return <em>unboxed</em> abstract
types, should also improve the experience (there is also an
<a href="https://github.com/rust-lang/rfcs/blob/master/text/1951-expand-impl-trait.md">expanded</a> proposal).</p>
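<p>As an illustration of what the feature allows, here is a small sketch using the syntax as it was later stabilized (in Rust 1.26). The function <code class="language-plaintext highlighter-rouge">evens</code> is a made-up example: it returns an iterator whose concrete type involves a closure and cannot be named, without boxing it:</p>

```rust
// Returns the even numbers below `limit`.
// The caller only knows that the result implements Iterator<Item = u32>.
fn evens(limit: u32) -> impl Iterator<Item = u32> {
    (0..limit).filter(|n| n % 2 == 0)
}

fn main() {
    let values: Vec<u32> = evens(10).collect();
    assert_eq!(values, vec![0, 2, 4, 6, 8]);
}
```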
<p>The compiler generally produces very helpful error messages. But when it does
not, they can be very <a href="https://stackoverflow.com/questions/44003622/implementing-trait-for-fnsomething-in-rust">confusing</a>.</p>
<h2 id="safety-pitfalls">Safety pitfalls</h2>
<p>The <a href="https://doc.rust-lang.org/nomicon/meet-safe-and-unsafe.html">first chapter of the <em>Rustonomicon</em></a> says:</p>
<blockquote>
<p>Safe Rust is For Reals Totally Safe.</p>
<p>[…]</p>
<p>Safe Rust is the true Rust programming language. If all you do is write Safe
Rust, you will never have to worry about type-safety or memory-safety. You
will never endure a null or dangling pointer, or any of that Undefined
Behavior nonsense.</p>
</blockquote>
<p>That’s the goal. And that’s <em>almost</em> true.</p>
<h3 id="leakpocalypse">Leakpocalypse</h3>
<p>In the past, it was <a href="https://github.com/rust-lang/rust/issues/24292">possible</a> to write <em>safe-Rust</em> code <strong>accessing
freed memory</strong>.</p>
<p>This “<a href="http://cglab.ca/~abeinges/blah/everyone-poops/">leakpocalypse</a>” led to a <a href="https://github.com/alexcrichton/rfcs/blob/safe-mem-forget/text/0000-safe-mem-forget.md">clarification</a> of the safety
guarantees: not running a destructor is now <a href="https://github.com/rust-lang/rfcs/pull/1066">considered <em>safe</em></a>. In
other words, <strong>memory-safety must not rely on <a href="https://en.wikipedia.org/wiki/Resource_acquisition_is_initialization">RAII</a></strong> anymore (in fact, it never
could, but this was noticed only belatedly).</p>
<p>As a consequence, <a href="https://doc.rust-lang.org/std/mem/fn.forget.html"><code class="language-plaintext highlighter-rouge">std::mem::forget</code></a> is now <em>safe</em>, and <a href="https://doc.rust-lang.org/1.0.0/std/thread/struct.JoinGuard.html"><code class="language-plaintext highlighter-rouge">JoinGuard</code></a> has been
deprecated and removed from the standard library (it has been moved to a
<a href="http://arcnmx.github.io/thread-scoped-rs/thread_scoped/">separate crate</a>).</p>
<p>Other tools relying on RAII (like <a href="https://doc.rust-lang.org/std/vec/struct.Vec.html#method.drain"><code class="language-plaintext highlighter-rouge">Vec::drain()</code></a>) must <a href="https://github.com/rust-lang/rust/blob/1.20.0/src/liballoc/vec.rs#L1094-L1102">take special care</a> to
prevent memory corruption.</p>
<p>Whew, <em>memory-safety</em> is (now) safe.</p>
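<p>The following sketch shows what "not running a destructor is safe" means in practice. The <code class="language-plaintext highlighter-rouge">Guard</code> type and the <code class="language-plaintext highlighter-rouge">destructor_ran</code> helper are hypothetical names used only for this illustration; the guard records whether its destructor ran, and <code class="language-plaintext highlighter-rouge">std::mem::forget</code> skips it without any <code class="language-plaintext highlighter-rouge">unsafe</code> block:</p>

```rust
use std::cell::Cell;
use std::mem;

// Hypothetical guard that records in `dropped` whether its destructor ran.
struct Guard<'a> {
    dropped: &'a Cell<bool>,
}

impl<'a> Drop for Guard<'a> {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

// Creates a guard, then either drops it or leaks it.
// Returns whether the destructor has been executed.
fn destructor_ran(leak: bool) -> bool {
    let dropped = Cell::new(false);
    let guard = Guard { dropped: &dropped };
    if leak {
        mem::forget(guard); // safe code: the destructor is never called
    } else {
        drop(guard);
    }
    dropped.get()
}

fn main() {
    assert!(destructor_ran(false));
    assert!(!destructor_ran(true));
}
```

<p>Any API whose memory-safety depends on such a guard being dropped is therefore unsound.</p>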
<h3 id="undefined-infinity">Undefined infinity</h3>
<p>In C and C++, <a href="https://stackoverflow.com/questions/3592557/optimizing-away-a-while1-in-c0x">infinite loops</a> without side-effects are <a href="http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1528.htm">undefined
behavior</a>. This makes it possible to write programs that unexpectedly
<a href="https://blog.regehr.org/archives/140">disprove Fermat’s Last Theorem</a>.</p>
<p>In practice, the Rust compiler relies on LLVM, which (currently) applies its
optimizations assuming that infinite loops without side-effects are <em>undefined
behavior</em>. As a consequence, such <em>undefined behaviors</em> also occur in Rust.</p>
<p>Here is a minimal sample to trigger it:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="nf">infinite</span><span class="p">()</span> <span class="p">{</span>
<span class="k">loop</span> <span class="p">{}</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
<span class="nf">infinite</span><span class="p">();</span>
<span class="p">}</span></code></pre></figure>
<p>Running without optimizations, it behaves as “expected”:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ rustc ub.rs && ./ub
^C (infinite loop, interrupt it)
</code></pre></div></div>
<p>Enabling optimizations makes the program panic:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ rustc -O ub.rs && ./ub
thread 'main' panicked at 'assertion failed: c.borrow().is_none()', /checkout/src/libstd/sys_common/thread_info.rs:51
note: Run with `RUST_BACKTRACE=1` for a backtrace.
</code></pre></div></div>
<p>Alternatively, we can produce unexpected results without crashing:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="nf">infinite</span><span class="p">(</span><span class="k">mut</span> <span class="n">value</span><span class="p">:</span> <span class="nb">u32</span><span class="p">)</span> <span class="p">{</span>
<span class="c">// infinite loop unless value initially equals 0</span>
<span class="k">while</span> <span class="n">value</span> <span class="o">!=</span> <span class="mi">0</span> <span class="p">{</span>
<span class="k">if</span> <span class="n">value</span> <span class="o">!=</span> <span class="mi">1</span> <span class="p">{</span>
<span class="n">value</span> <span class="o">-=</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
<span class="nf">infinite</span><span class="p">(</span><span class="mi">42</span><span class="p">);</span>
<span class="nd">println!</span><span class="p">(</span><span class="s">"end"</span><span class="p">);</span>
<span class="p">}</span></code></pre></figure>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ rustc ub.rs && ./ub
^C (infinite loop, interrupt it)
</code></pre></div></div>
<p>But with optimizations:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ rustc -O ub.rs && ./ub
end
</code></pre></div></div>
<p>This is a corner case that will probably be solved in the future. In practice,
<strong>Rust safety guarantees are pretty strong</strong> (at the cost of being constraining).</p>
<h3 id="segfault">Segfault</h3>
<p><em>This section has been added after the publication.</em></p>
<p>There are other sources of <em>undefined behaviors</em> (look at the <a href="https://github.com/rust-lang/rust/labels/I-unsound">issues tagged
<em>I-unsound</em></a>).</p>
<p>For instance, casting a <em>float</em> value that cannot fit into the target type is
<em>undefined behavior</em>, which can be <a href="https://github.com/rust-lang/rust/issues/10184#issuecomment-139858153">propagated</a> to trigger a segfault:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="nd">#[inline(never)]</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">f</span><span class="p">(</span><span class="n">ary</span><span class="p">:</span> <span class="o">&</span><span class="p">[</span><span class="nb">u8</span><span class="p">;</span> <span class="mi">5</span><span class="p">])</span> <span class="k">-></span> <span class="o">&</span><span class="p">[</span><span class="nb">u8</span><span class="p">]</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">idx</span> <span class="o">=</span> <span class="mf">1e100f64</span> <span class="k">as</span> <span class="nb">usize</span><span class="p">;</span>
<span class="o">&</span><span class="n">ary</span><span class="p">[</span><span class="n">idx</span><span class="o">..</span><span class="p">]</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
<span class="nd">println!</span><span class="p">(</span><span class="s">"{}"</span><span class="p">,</span> <span class="nf">f</span><span class="p">(</span><span class="o">&</span><span class="p">[</span><span class="mi">1</span><span class="p">;</span> <span class="mi">5</span><span class="p">])[</span><span class="mi">0xdeadbeef</span><span class="p">]);</span>
<span class="p">}</span></code></pre></figure>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ rustc -O ub.rs && ./ub
Segmentation fault
</code></pre></div></div>
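<p>Independently of how the compiler handles the cast (since Rust 1.45, <code class="language-plaintext highlighter-rouge">as</code> casts from float to integer saturate instead of being undefined), the conversion can be guarded explicitly. The sketch below uses two hypothetical helpers, <code class="language-plaintext highlighter-rouge">to_index</code> and <code class="language-plaintext highlighter-rouge">slice_from</code>, which reject values that do not fit instead of casting them blindly:</p>

```rust
// Hypothetical helper: converts a float to an index only if it fits in usize.
// The fractional part, if any, is truncated.
fn to_index(value: f64) -> Option<usize> {
    // `usize::MAX as f64` may round up (to 2^64 on 64-bit targets),
    // hence the strict comparison.
    if value >= 0.0 && value < usize::MAX as f64 {
        Some(value as usize)
    } else {
        None // negative, too large, infinite or NaN
    }
}

// Same purpose as f() above, but out-of-range positions yield None.
fn slice_from(ary: &[u8; 5], position: f64) -> Option<&[u8]> {
    let idx = to_index(position)?;
    ary.get(idx..)
}

fn main() {
    let ary = [1u8; 5];
    assert_eq!(slice_from(&ary, 2.0), Some(&ary[2..]));
    assert_eq!(slice_from(&ary, 1e100), None);
    assert_eq!(slice_from(&ary, 6.0), None);
}
```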
<h2 id="stats">Stats</h2>
<p>That’s all for my feedback about the language itself.</p>
<p>As an appendix, let’s compare the current <em>Java</em> and <em>Rust</em> versions of the
relay server.</p>
<h3 id="number-of-lines">Number of lines</h3>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cloc relay-{java,rust}/src
-------------------------------------------------------------------------------
Language files blank comment code
-------------------------------------------------------------------------------
Rust 29 687 655 4506
Java 37 726 701 2931
-------------------------------------------------------------------------------
</code></pre></div></div>
<p><em>(tests included)</em></p>
<p>The Rust project is significantly bigger, for several reasons:</p>
<ul>
<li>there are many <a href="#mutable-data-sharing"><em>borrowing views</em></a> classes;</li>
<li>the Rust version contains its own <em>selector</em> class, wrapping the lower-level
<a href="https://docs.rs/mio/0.6.10/mio/struct.Poll.html"><code class="language-plaintext highlighter-rouge">Poll</code></a>, while the Java version uses the standard
<a href="https://docs.oracle.com/javase/8/docs/api/java/nio/channels/Selector.html"><code class="language-plaintext highlighter-rouge">Selector</code></a>;</li>
<li>the <a href="https://doc.rust-lang.org/book/first-edition/error-handling.html">error handling</a> for command-line parsing is more verbose.</li>
</ul>
<p>The Java version has more files because the unit tests are separate, while in
Rust they are in the same file as the classes they test.</p>
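<p>For readers unfamiliar with this convention, here is a minimal sketch of a source file carrying its own unit tests. The <code class="language-plaintext highlighter-rouge">checksum</code> function is a made-up example, not code from Gnirehtet; the test module is only compiled by <code class="language-plaintext highlighter-rouge">cargo test</code>:</p>

```rust
// Made-up function standing in for any code of the project.
fn checksum(data: &[u8]) -> u32 {
    data.iter().map(|&b| u32::from(b)).sum()
}

// Compiled only when running the tests, so it adds nothing to the binary.
#[cfg(test)]
mod checksum_tests {
    use super::*;

    #[test]
    fn adds_all_bytes() {
        assert_eq!(checksum(&[1, 2, 3]), 6);
    }

    #[test]
    fn empty_input_is_zero() {
        assert_eq!(checksum(&[]), 0);
    }
}

fn main() {
    println!("{}", checksum(&[1, 2, 3]));
}
```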
<p>Just for information, here are the results for the Android client:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cloc app/src
-------------------------------------------------------------------------------
Language files blank comment code
-------------------------------------------------------------------------------
Java 15 198 321 875
XML 6 7 2 76
-------------------------------------------------------------------------------
SUM: 21 205 323 951
-------------------------------------------------------------------------------
</code></pre></div></div>
<h3 id="binary-size">Binary size</h3>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>--------------------------------------------
Java gnirehtet.jar 61K
--------------------------------------------
Rust gnirehtet 3.0M
after "strip -g gnirehtet" 747K
after "strip gnirehtet" 588K
--------------------------------------------
</code></pre></div></div>
<p>The Java binary itself is far smaller. The comparison is not fair though, since
it requires the <em>Java Runtime Environment</em>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ du -sh /usr/lib/jvm/java-1.8.0-openjdk-amd64/
156M /usr/lib/jvm/java-1.8.0-openjdk-amd64/
</code></pre></div></div>
<h3 id="memory-usage">Memory usage</h3>
<p>With a single TCP connection opened, here is the memory consumption for the Java
relay server:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ sudo pmap -x $RELAY_JAVA_PID
Kbytes RSS Dirty
total kB 4364052 86148 69316
</code></pre></div></div>
<p><em>(output filtered)</em></p>
<p>And for the Rust relay server:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ sudo pmap -x $RELAY_RUST_PID
Kbytes RSS Dirty
total kB 19272 2736 640
</code></pre></div></div>
<p><em>Look at the <a href="https://en.wikipedia.org/wiki/Resident_set_size">RSS</a> value, which indicates the actual memory used.</em></p>
<p>As expected, the Java version consumes more memory (86 MB) than the Rust one
(less than 3 MB). Moreover, its value is unstable due to the allocation of tiny
objects and their <a href="https://en.wikipedia.org/wiki/Garbage_collection_(computer_science)">garbage collection</a>, which also generates more dirty pages.
On the contrary, the Rust value is very stable: once the connection is created,
there are no memory allocations <em>at all</em>.</p>
<h3 id="cpu-usage">CPU usage</h3>
<p>To compare CPU usage, here is my scenario: a 500 MB file is hosted by Apache on
my laptop, I start the relay server through <code class="language-plaintext highlighter-rouge">perf stat</code>, then I download the
file from Firefox on Android. As soon as the file is downloaded, I stop the
relay server (Ctrl+C).</p>
<p>Here are the results for the Java version:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ perf stat -B java -jar gnirehtet.jar relay
Performance counter stats for 'java -jar gnirehtet.jar relay':
11805,458302 task-clock:u (msec) # 0,088 CPUs utilized
0 context-switches:u # 0,000 K/sec
0 cpu-migrations:u # 0,000 K/sec
28 618 page-faults:u # 0,002 M/sec
17 908 360 446 cycles:u # 1,517 GHz
13 944 172 792 stalled-cycles-frontend:u # 77,86% frontend cycles idle
18 437 279 663 instructions:u # 1,03 insn per cycle
# 0,76 stalled cycles per insn
3 088 215 431 branches:u # 261,592 M/sec
70 647 760 branch-misses:u # 2,29% of all branches
133,975117164 seconds time elapsed
</code></pre></div></div>
<p>And for the Rust version:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ perf stat -B ./gnirehtet relay
Performance counter stats for 'target/release/gnirehtet relay':
2707,479968 task-clock:u (msec) # 0,020 CPUs utilized
0 context-switches:u # 0,000 K/sec
0 cpu-migrations:u # 0,000 K/sec
1 001 page-faults:u # 0,370 K/sec
1 011 527 340 cycles:u # 0,374 GHz
2 033 810 378 stalled-cycles-frontend:u # 201,06% frontend cycles idle
981 103 003 instructions:u # 0,97 insn per cycle
# 2,07 stalled cycles per insn
98 929 222 branches:u # 36,539 M/sec
3 220 527 branch-misses:u # 3,26% of all branches
133,766035253 seconds time elapsed
</code></pre></div></div>
<p>I am not an expert in analyzing the results, but as far as I understand from
the <code class="language-plaintext highlighter-rouge">task-clock:u</code> value, the Rust version consumes 4× less CPU-time.</p>
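<p>The ratio can be checked from the figures reported above. The short sketch below (the <code class="language-plaintext highlighter-rouge">ratio</code> helper is just for illustration) divides the two <code class="language-plaintext highlighter-rouge">task-clock:u</code> values, which gives about 4.36, and does the same for the RSS values measured with <code class="language-plaintext highlighter-rouge">pmap</code>, which gives about 31:</p>

```rust
// Ratio between the Java and Rust measurements of the same quantity.
fn ratio(java: f64, rust: f64) -> f64 {
    java / rust
}

fn main() {
    // task-clock:u values (msec), copied from the two perf stat outputs
    let cpu = ratio(11805.458302, 2707.479968);
    println!("CPU time: Java / Rust = {:.2}", cpu);

    // RSS values (kB), copied from the two pmap outputs
    let rss = ratio(86148.0, 2736.0);
    println!("RSS:      Java / Rust = {:.2}", rss);
}
```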
<h2 id="conclusion">Conclusion</h2>
<p>Rewriting <em>Gnirehtet</em> in Rust was an amazing experience, where I learnt a great
language and new programming concepts. And now, we get a native application
showing better performance.</p>
<p>Happy reverse tethering!</p>
<p><em>Discuss on <a href="https://www.reddit.com/r/rust/comments/71ks57/gnirehtet_a_reverse_tethering_tool_for_android/">reddit</a> and <a href="https://news.ycombinator.com/item?id=15326106">Hacker News</a>.</em></p>
Gnirehtet réécrit en Rust2017-09-21T17:00:00+02:00https://blog.rom1v.com/2017/09/gnirehtet-reecrit-en-rust<p>A few months ago, I introduced <a href="/2017/03/gnirehtet/">Gnirehtet</a>, a
<em>reverse tethering</em> tool for Android that I wrote in Java.</p>
<p>Since then, <strong>I have rewritten it in <a href="https://www.rust-lang.org/">Rust</a></strong>.</p>
<p>And it is also open source! <a href="https://github.com/Genymobile/gnirehtet">Download it</a>, plug in an
Android phone or tablet, and run:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./gnirehtet run
</code></pre></div></div>
<p><em>(adb must be installed)</em></p>
<ul id="markdown-toc">
<li><a href="#pourquoi-rust" id="markdown-toc-pourquoi-rust">Why Rust?</a></li>
<li><a href="#apprendre-rust" id="markdown-toc-apprendre-rust">Learning Rust</a></li>
<li><a href="#difficultés" id="markdown-toc-difficultés">Difficulties</a> <ul>
<li><a href="#encapsulation" id="markdown-toc-encapsulation">Encapsulation</a></li>
<li><a href="#observateur" id="markdown-toc-observateur">Observer</a></li>
<li><a href="#partage-de-données-mutables" id="markdown-toc-partage-de-données-mutables">Mutable data sharing</a></li>
<li><a href="#limitations-du-compilateur" id="markdown-toc-limitations-du-compilateur">Compiler limitations</a></li>
</ul>
</li>
<li><a href="#sûreté-et-pièges" id="markdown-toc-sûreté-et-pièges">Safety pitfalls</a> <ul>
<li><a href="#leakpocalypse" id="markdown-toc-leakpocalypse">Leakpocalypse</a></li>
<li><a href="#infinité-indéfinie" id="markdown-toc-infinité-indéfinie">Undefined infinity</a></li>
<li><a href="#erreur-de-segmentation" id="markdown-toc-erreur-de-segmentation">Segfault</a></li>
</ul>
</li>
<li><a href="#stats" id="markdown-toc-stats">Stats</a> <ul>
<li><a href="#nombre-de-lignes" id="markdown-toc-nombre-de-lignes">Number of lines</a></li>
<li><a href="#taille-des-binaires" id="markdown-toc-taille-des-binaires">Binary size</a></li>
<li><a href="#utilisation-mémoire" id="markdown-toc-utilisation-mémoire">Memory usage</a></li>
<li><a href="#utilisation-cpu" id="markdown-toc-utilisation-cpu">CPU usage</a></li>
</ul>
</li>
<li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ul>
<h2 id="pourquoi-rust">Why Rust?</h2>
<p>At Genymobile, we wanted <em>Gnirehtet</em> not to require a Java Runtime
Environment (JRE), so the main requirement was to compile the application
to a <em>native</em> executable binary.</p>
<p>Therefore, I first considered rewriting it in C or C++. But at that time
(early May), I was interested in learning Rust, after vaguely hearing about
its features:</p>
<ul>
<li><a href="https://blog.rust-lang.org/2015/04/10/Fearless-Concurrency.html">memory safety without <em>garbage collection</em></a>,</li>
<li><a href="https://blog.rust-lang.org/2015/04/10/Fearless-Concurrency.html">concurrency without <em>data races</em></a>,</li>
<li><a href="https://blog.rust-lang.org/2015/05/11/traits.html">abstraction without overhead</a>.</li>
</ul>
<p>However, I had never written a line of Rust nor heard about its system of
<a href="https://doc.rust-lang.org/book/first-edition/ownership.html">ownership</a>, <a href="https://doc.rust-lang.org/book/first-edition/references-and-borrowing.html">borrowing</a> or
<a href="https://doc.rust-lang.org/book/first-edition/lifetimes.html">lifetimes</a>.</p>
<p>But I am convinced that the best way to learn a programming language is to
work full-time on a project in that language.</p>
<p>I was motivated, so after checking that it could fit (basically, I wrote a
sample using the <a href="https://en.wikipedia.org/wiki/Asynchronous_I/O">asynchronous I/O</a> library <a href="https://crates.io/crates/mio">mio</a>,
and ran it on both Linux and Windows), I decided to rewrite
<em>Gnirehtet</em> in Rust.</p>
<h2 id="apprendre-rust">Learning Rust</h2>
<p>During the rewrite, I <em>devoured</em> successively the <a href="https://doc.rust-lang.org/book/first-edition/">Rust book</a>, <a href="https://rustbyexample.com/">Rust by
example</a> and the <a href="https://doc.rust-lang.org/nomicon/">Rustonomicon</a>. I learned a lot, and I really love this
language. I now miss many of its features when I work on a C++
project:</p>
<ul>
<li><a href="https://rustbyexample.com/cast/inference.html">advanced type inference</a>,</li>
<li><a href="https://doc.rust-lang.org/book/first-edition/enums.html">enums</a>,</li>
<li><a href="https://doc.rust-lang.org/book/first-edition/patterns.html">patterns</a>,</li>
<li><a href="https://doc.rust-lang.org/book/first-edition/traits.html">trait bounds</a>,</li>
<li><a href="https://doc.rust-lang.org/std/option/"><code class="language-plaintext highlighter-rouge">Option<T></code></a> (like <a href="https://github.com/tvaneerd/cpp17_in_TTs/blob/master/ALL_IN_ONE.md#stdoptionalt"><code class="language-plaintext highlighter-rouge">std::optional<T></code></a> in C++17, but benefiting from
enums and patterns),</li>
<li><a href="https://doc.rust-lang.org/book/first-edition/macros.html">hygienic macros</a>,</li>
<li>the absence of header files,</li>
<li>the (so simple) <em>build</em> system, <em>and of course</em></li>
<li>the guaranteed memory safety.</li>
</ul>
<p>About learning, Paul Graham <a href="http://paulgraham.com/know.html">wrote</a>:</p>
<blockquote>
<p><strong>Reading and experience train your model of the world.</strong> And even if you
forget the experience or what you read, its effect on your model of the world
persists. Your mind is like a compiled program you’ve lost the source of. It
works, but you don’t know why.</p>
</blockquote>
<p>Some Rust concepts (like <a href="https://doc.rust-lang.org/book/first-edition/lifetimes.html">lifetimes</a> or
<a href="https://doc.rust-lang.org/book/first-edition/ownership.html#move-semantics">move semantics</a> by default) provided me with a significantly different
<em>training set</em>, which has definitely affected my model
of the world (of programming).</p>
<p>I am not going to present all these features (follow the documentation links
if you are interested). Instead, I will try to explain where
and why Rust resisted the <em>design</em> I wanted to implement, and how to
rethink the problems within the constraints of Rust.</p>
<p><em>The following part requires some knowledge of Rust. You may want to skip
directly to the <a href="#stats">stats</a>.</em></p>
<h2 id="difficultés">Difficulties</h2>
<p>I found the design of the Java application rather successful, so I wanted to
reproduce the global architecture in the Rust version (with possible
adaptations to <em>rustify</em> it).</p>
<p>But I struggled with the details, especially to satisfy the <a href="https://doc.rust-lang.org/book/first-edition/references-and-borrowing.html"><em>borrow
checker</em></a>. The <a href="https://doc.rust-lang.org/book/first-edition/references-and-borrowing.html#the-rules">rules</a> are simple:</p>
<blockquote>
<p>First, any borrow must last for a scope no greater than that of the owner.
Second, you may have one or the other of these two kinds of borrows, but not
both at the same time:</p>
<ul>
<li>one or more references (<code class="language-plaintext highlighter-rouge">&T</code>) to a resource,</li>
<li>exactly one mutable reference (<code class="language-plaintext highlighter-rouge">&mut T</code>).</li>
</ul>
</blockquote>
<p>However, it took me some time to realize how they conflict with
some principles or design patterns.</p>
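<p>The two borrowing rules quoted above can be illustrated with a minimal sketch (the <code class="language-plaintext highlighter-rouge">demo</code> function is only there for the example): several shared references may coexist, while a mutable reference must be alone for as long as it is used:</p>

```rust
// Demonstrates both kinds of borrows on a local variable,
// and returns its final value.
fn demo() -> i32 {
    let mut value = 10;
    {
        // any number of shared references (&T) may coexist
        let a = &value;
        let b = &value;
        assert_eq!(*a + *b, 20);
    }
    {
        // exactly one mutable reference (&mut T), and nothing else
        let m = &mut value;
        *m += 1;
        // Creating `&value` here and using it before the last use
        // of `m` would be rejected by the borrow checker.
    }
    value
}

fn main() {
    assert_eq!(demo(), 11);
}
```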
<p>So here is my feedback. I selected 4 subjects that are general enough
to be independent of this particular project:</p>
<ul>
<li>the conflicts with <a href="#encapsulation">encapsulation</a>;</li>
<li>the <a href="#observateur">observer</a> <em>design pattern</em>;</li>
<li>how to <a href="#partage-de-données-mutables">share mutable data</a>;</li>
<li>a quick summary of the <a href="#limitations-du-compilateur">annoying compiler
limitations</a>.</li>
</ul>
<h3 id="encapsulation">Encapsulation</h3>
<p><strong>The borrowing rules constrain encapsulation.</strong> This is the first
consequence I realized.</p>
<p>Here is a canonical sample:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">pub</span> <span class="k">struct</span> <span class="n">Data</span> <span class="p">{</span>
<span class="n">header</span><span class="p">:</span> <span class="p">[</span><span class="nb">u8</span><span class="p">;</span> <span class="mi">4</span><span class="p">],</span>
<span class="n">payload</span><span class="p">:</span> <span class="p">[</span><span class="nb">u8</span><span class="p">;</span> <span class="mi">20</span><span class="p">],</span>
<span class="p">}</span>
<span class="k">impl</span> <span class="n">Data</span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">new</span><span class="p">()</span> <span class="k">-></span> <span class="n">Self</span> <span class="p">{</span>
<span class="n">Self</span> <span class="p">{</span>
<span class="n">header</span><span class="p">:</span> <span class="p">[</span><span class="mi">0</span><span class="p">;</span> <span class="mi">4</span><span class="p">],</span>
<span class="n">payload</span><span class="p">:</span> <span class="p">[</span><span class="mi">0</span><span class="p">;</span> <span class="mi">20</span><span class="p">],</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">header</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="o">&</span><span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">]</span> <span class="p">{</span>
<span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.header</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">payload</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="o">&</span><span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">]</span> <span class="p">{</span>
<span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.payload</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">data</span> <span class="o">=</span> <span class="nn">Data</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
<span class="k">let</span> <span class="n">header</span> <span class="o">=</span> <span class="n">data</span><span class="nf">.header</span><span class="p">();</span>
<span class="k">let</span> <span class="n">payload</span> <span class="o">=</span> <span class="n">data</span><span class="nf">.payload</span><span class="p">();</span>
<span class="p">}</span></code></pre></figure>
<p>We just create a new instance of <code class="language-plaintext highlighter-rouge">Data</code>, then bind mutable references to
the <code class="language-plaintext highlighter-rouge">header</code> and <code class="language-plaintext highlighter-rouge">payload</code> arrays to local variables, going through
accessors.</p>
<p>However, this does not compile:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ rustc sample.rs
error[E0499]: cannot borrow `data` as mutable more than once at a time
--> sample.rs:26:19
|
25 | let header = data.header();
| ---- first mutable borrow occurs here
26 | let payload = data.payload();
| ^^^^ second mutable borrow occurs here
27 | }
| - first borrow ends here
</code></pre></div></div>
<p>The compiler cannot assume that <code class="language-plaintext highlighter-rouge">header()</code> and <code class="language-plaintext highlighter-rouge">payload()</code>
return references to disjoint data in the <code class="language-plaintext highlighter-rouge">Data</code> struct.
Therefore, each call <em>borrows</em> the whole <code class="language-plaintext highlighter-rouge">data</code> structure. Since the
borrowing rules forbid holding two mutable references to the same
resource, the compiler rejects the second call.</p>
<p>Sometimes, we face temporary limitations because the compiler is not smart
enough (yet). This is not the case here:
the implementation of <code class="language-plaintext highlighter-rouge">header()</code> could very well return a reference to
<code class="language-plaintext highlighter-rouge">payload</code>, or write to the <code class="language-plaintext highlighter-rouge">payload</code> array, thereby violating the borrowing
rules. And the validity of a method call cannot depend on the
implementation of the method.</p>
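<p>To make this point concrete, here is a minimal sketch of an accessor with exactly the same signature as <code class="language-plaintext highlighter-rouge">header()</code> that hands out part of <code class="language-plaintext highlighter-rouge">payload</code> instead. The name <code class="language-plaintext highlighter-rouge">misleading_header()</code> is hypothetical and only serves the illustration; nothing in the signature lets a caller (or the compiler, at the call site) tell the two apart.</p>

```rust
#[allow(dead_code)]
pub struct Data {
    header: [u8; 4],
    payload: [u8; 20],
}

impl Data {
    pub fn new() -> Self {
        Self {
            header: [0; 4],
            payload: [0; 20],
        }
    }

    // Hypothetical accessor: same signature as header(), but it returns
    // a slice of the payload array instead of the header array.
    pub fn misleading_header(&mut self) -> &mut [u8] {
        &mut self.payload[..4]
    }

    pub fn payload(&mut self) -> &mut [u8] {
        &mut self.payload
    }
}

fn main() {
    let mut data = Data::new();
    // Write through the "header" accessor...
    data.misleading_header()[0] = 0xff;
    // ...and observe that the payload array was modified.
    assert_eq!(data.payload()[0], 0xff);
}
```

<p>If both accessors could be held mutably at the same time, the two slices would alias the same bytes, which is precisely what the borrowing rules are meant to prevent.</p>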
<p>To fix the problem, the compiler must be able to know that the
local variables <code class="language-plaintext highlighter-rouge">header</code> and <code class="language-plaintext highlighter-rouge">payload</code> reference <strong>disjoint data</strong>,
for example by accessing the fields directly:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"> <span class="k">let</span> <span class="n">header</span> <span class="o">=</span> <span class="o">&</span><span class="k">mut</span> <span class="n">data</span><span class="py">.header</span><span class="p">;</span>
<span class="k">let</span> <span class="n">payload</span> <span class="o">=</span> <span class="o">&</span><span class="k">mut</span> <span class="n">data</span><span class="py">.payload</span><span class="p">;</span></code></pre></figure>
<p>or by exposing a method that provides both references simultaneously:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">impl</span> <span class="n">Data</span> <span class="p">{</span>
<span class="k">fn</span> <span class="nf">header_and_payload</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">],</span> <span class="o">&</span><span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">])</span> <span class="p">{</span>
<span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.header</span><span class="p">,</span> <span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.payload</span><span class="p">)</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">data</span> <span class="o">=</span> <span class="nn">Data</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
<span class="k">let</span> <span class="p">(</span><span class="n">header</span><span class="p">,</span> <span class="n">payload</span><span class="p">)</span> <span class="o">=</span> <span class="n">data</span><span class="nf">.header_and_payload</span><span class="p">();</span>
<span class="p">}</span></code></pre></figure>
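<p>For reference, the second approach can be assembled into a self-contained sketch. It reuses the <code class="language-plaintext highlighter-rouge">Data</code> struct from the first example; the values written in <code class="language-plaintext highlighter-rouge">main()</code> are arbitrary and only show that both mutable slices can be used while the other is still alive.</p>

```rust
pub struct Data {
    header: [u8; 4],
    payload: [u8; 20],
}

impl Data {
    pub fn new() -> Self {
        Self {
            header: [0; 4],
            payload: [0; 20],
        }
    }

    // Both borrows are created in a single body, where the compiler can
    // see that they target two different fields.
    pub fn header_and_payload(&mut self) -> (&mut [u8], &mut [u8]) {
        (&mut self.header, &mut self.payload)
    }
}

fn main() {
    let mut data = Data::new();
    let (header, payload) = data.header_and_payload();
    // Both mutable slices are alive at the same time.
    header[0] = 1;
    payload[0] = header[0] + 1;
    assert_eq!((header[0], payload[0]), (1, 2));
}
```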
<p>Similarly, inside a struct implementation, the borrowing rules make it hard
to factor code out into a private method. Consider this (artificial)
example:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">pub</span> <span class="k">struct</span> <span class="n">Data</span> <span class="p">{</span>
<span class="n">buf</span><span class="p">:</span> <span class="p">[</span><span class="nb">u8</span><span class="p">;</span> <span class="mi">20</span><span class="p">],</span>
<span class="n">prefix_length</span><span class="p">:</span> <span class="nb">usize</span><span class="p">,</span>
<span class="n">sum</span><span class="p">:</span> <span class="nb">u32</span><span class="p">,</span>
<span class="n">port</span><span class="p">:</span> <span class="nb">u16</span><span class="p">,</span>
<span class="p">}</span>
<span class="k">impl</span> <span class="n">Data</span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">update_sum</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">content</span> <span class="o">=</span> <span class="o">&</span><span class="k">self</span><span class="py">.buf</span><span class="p">[</span><span class="k">self</span><span class="py">.prefix_length</span><span class="o">..</span><span class="p">];</span>
<span class="k">self</span><span class="py">.sum</span> <span class="o">=</span> <span class="n">content</span><span class="nf">.iter</span><span class="p">()</span><span class="nf">.cloned</span><span class="p">()</span><span class="nf">.map</span><span class="p">(</span><span class="nn">u32</span><span class="p">::</span><span class="n">from</span><span class="p">)</span><span class="nf">.sum</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">update_port</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">content</span> <span class="o">=</span> <span class="o">&</span><span class="k">self</span><span class="py">.buf</span><span class="p">[</span><span class="k">self</span><span class="py">.prefix_length</span><span class="o">..</span><span class="p">];</span>
<span class="k">self</span><span class="py">.port</span> <span class="o">=</span> <span class="p">(</span><span class="n">content</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="k">as</span> <span class="nb">u16</span><span class="p">)</span> <span class="o"><<</span> <span class="mi">8</span> <span class="p">|</span> <span class="n">content</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="k">as</span> <span class="nb">u16</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>Here, the <code class="language-plaintext highlighter-rouge">buf</code> field is an array storing a prefix and a content
contiguously.</p>
<p>We want to factor out the way we retrieve the <code class="language-plaintext highlighter-rouge">content</code> <em>slice</em>,
so that the <code class="language-plaintext highlighter-rouge">update_*()</code> methods do not have to bother with the details.
Let’s try:</p>
<figure class="highlight"><pre><code class="language-diff" data-lang="diff"> impl Data {
pub fn update_sum(&mut self) {
<span class="gd">- let content = &self.buf[self.prefix_length..];
</span><span class="gi">+ let content = self.content();
</span> self.sum = content.iter().cloned().map(u32::from).sum();
}
pub fn update_port(&mut self) {
<span class="gd">- let content = &self.buf[self.prefix_length..];
</span><span class="gi">+ let content = self.content();
</span> self.port = (content[2] as u16) << 8 | content[3] as u16;
}
<span class="gi">+
+ fn content(&mut self) -> &[u8] {
+ &self.buf[self.prefix_length..]
+ }
</span> }</code></pre></figure>
<p>Unfortunately, this does not compile:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>error[E0506]: cannot assign to `self.sum` because it is borrowed
--> facto2.rs:11:9
|
10 | let content = self.content();
| ---- borrow of `self.sum` occurs here
11 | self.sum = content.iter().cloned().map(u32::from).sum();
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ assignment to borrowed `self.sum` occurs here
error[E0506]: cannot assign to `self.port` because it is borrowed
--> facto2.rs:16:9
|
15 | let content = self.content();
| ---- borrow of `self.port` occurs here
16 | self.port = (content[2] as u16) << 8 | content[3] as u16;
|
</code></pre></div></div>
<p>As in the previous example, retrieving a reference through a method
<em>borrows</em> the whole struct (here, <code class="language-plaintext highlighter-rouge">self</code>).</p>
<p>To work around the problem, we can explain to the compiler that the
fields are disjoint:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">impl</span> <span class="n">Data</span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">update_sum</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">content</span> <span class="o">=</span> <span class="nn">Self</span><span class="p">::</span><span class="nf">content</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="py">.buf</span><span class="p">,</span> <span class="k">self</span><span class="py">.prefix_length</span><span class="p">);</span>
<span class="k">self</span><span class="py">.sum</span> <span class="o">=</span> <span class="n">content</span><span class="nf">.iter</span><span class="p">()</span><span class="nf">.cloned</span><span class="p">()</span><span class="nf">.map</span><span class="p">(</span><span class="nn">u32</span><span class="p">::</span><span class="n">from</span><span class="p">)</span><span class="nf">.sum</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">update_port</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">content</span> <span class="o">=</span> <span class="nn">Self</span><span class="p">::</span><span class="nf">content</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="py">.buf</span><span class="p">,</span> <span class="k">self</span><span class="py">.prefix_length</span><span class="p">);</span>
<span class="k">self</span><span class="py">.port</span> <span class="o">=</span> <span class="p">(</span><span class="n">content</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="k">as</span> <span class="nb">u16</span><span class="p">)</span> <span class="o"><<</span> <span class="mi">8</span> <span class="p">|</span> <span class="n">content</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="k">as</span> <span class="nb">u16</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="nf">content</span><span class="p">(</span><span class="n">buf</span><span class="p">:</span> <span class="o">&</span><span class="p">[</span><span class="nb">u8</span><span class="p">],</span> <span class="n">prefix_length</span><span class="p">:</span> <span class="nb">usize</span><span class="p">)</span> <span class="k">-></span> <span class="o">&</span><span class="p">[</span><span class="nb">u8</span><span class="p">]</span> <span class="p">{</span>
<span class="o">&</span><span class="n">buf</span><span class="p">[</span><span class="n">prefix_length</span><span class="o">..</span><span class="p">]</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>It compiles, but it totally defeats the purpose of factoring:
the caller has to provide the necessary fields.</p>
<p>As an alternative, we can use a <a href="https://doc.rust-lang.org/book/first-edition/macros.html">macro</a> to <em>inline</em> the
code:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="nd">macro_rules!</span> <span class="n">content</span> <span class="p">{</span>
<span class="p">(</span><span class="nv">$self:ident</span><span class="p">)</span> <span class="k">=></span> <span class="p">{</span>
<span class="o">&</span><span class="nv">$self</span><span class="py">.buf</span><span class="p">[</span><span class="nv">$self</span><span class="py">.prefix_length</span><span class="o">..</span><span class="p">]</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">impl</span> <span class="n">Data</span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">update_sum</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">content</span> <span class="o">=</span> <span class="nd">content!</span><span class="p">(</span><span class="k">self</span><span class="p">);</span>
<span class="k">self</span><span class="py">.sum</span> <span class="o">=</span> <span class="n">content</span><span class="nf">.iter</span><span class="p">()</span><span class="nf">.cloned</span><span class="p">()</span><span class="nf">.map</span><span class="p">(</span><span class="nn">u32</span><span class="p">::</span><span class="n">from</span><span class="p">)</span><span class="nf">.sum</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">update_port</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">content</span> <span class="o">=</span> <span class="nd">content!</span><span class="p">(</span><span class="k">self</span><span class="p">);</span>
<span class="k">self</span><span class="py">.port</span> <span class="o">=</span> <span class="p">(</span><span class="n">content</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="k">as</span> <span class="nb">u16</span><span class="p">)</span> <span class="o"><<</span> <span class="mi">8</span> <span class="p">|</span> <span class="n">content</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="k">as</span> <span class="nb">u16</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>But this is far from ideal.</p>
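<p>One further option, sketched here under the assumption that the slice is only needed to compute a value and is not kept around while a field is written: let the helper take <code class="language-plaintext highlighter-rouge">&self</code>, finish using the slice first, and only then assign the result. The values in <code class="language-plaintext highlighter-rouge">main()</code> are made up for the illustration.</p>

```rust
pub struct Data {
    buf: [u8; 20],
    prefix_length: usize,
    sum: u32,
    port: u16,
}

impl Data {
    pub fn update_sum(&mut self) {
        // The borrow taken by content() ends with this statement.
        let sum: u32 = self.content().iter().cloned().map(u32::from).sum();
        self.sum = sum;
    }

    pub fn update_port(&mut self) {
        // The slice is confined to this inner block.
        let port = {
            let content = self.content();
            (content[2] as u16) << 8 | content[3] as u16
        };
        self.port = port;
    }

    // Takes &self: reading the slice does not need exclusive access.
    fn content(&self) -> &[u8] {
        &self.buf[self.prefix_length..]
    }
}

fn main() {
    let mut buf = [0u8; 20];
    buf[4..8].copy_from_slice(&[1, 2, 3, 4]);
    let mut data = Data { buf, prefix_length: 4, sum: 0, port: 0 };
    data.update_sum();
    data.update_port();
    assert_eq!(data.sum, 10);
    assert_eq!(data.port, 0x0304);
}
```

<p>This does not lift the limitation itself: as soon as the slice has to stay alive across a write to another field, one of the previous workarounds is still needed.</p>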
<p>I think we just have to live with it: encapsulation sometimes conflicts
with the borrowing rules. After all, this is not so surprising:
enforcing the borrowing rules requires tracking every concrete access to
resources, whereas encapsulation aims at abstracting them away.</p>
<h3 id="observateur">Observer</h3>
<p>The <a href="https://en.wikipedia.org/wiki/Observer_pattern">observer</a> <em>design pattern</em> is useful for registering event
listeners on an object.</p>
<p>In some cases, <strong>this pattern is difficult to implement in Rust</strong>.</p>
<p>To keep things simple, let’s consider that the events are <code class="language-plaintext highlighter-rouge">u32</code> values. Here is
a possible implementation:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">pub</span> <span class="k">trait</span> <span class="n">EventListener</span> <span class="p">{</span>
<span class="k">fn</span> <span class="nf">on_event</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="p">,</span> <span class="n">event</span><span class="p">:</span> <span class="nb">u32</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">Notifier</span> <span class="p">{</span>
<span class="n">listeners</span><span class="p">:</span> <span class="nb">Vec</span><span class="o"><</span><span class="nb">Box</span><span class="o"><</span><span class="k">dyn</span> <span class="n">EventListener</span><span class="o">>></span><span class="p">,</span>
<span class="p">}</span>
<span class="k">impl</span> <span class="n">Notifier</span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">new</span><span class="p">()</span> <span class="k">-></span> <span class="n">Self</span> <span class="p">{</span>
<span class="n">Self</span> <span class="p">{</span> <span class="n">listeners</span><span class="p">:</span> <span class="nn">Vec</span><span class="p">::</span><span class="nf">new</span><span class="p">()</span> <span class="p">}</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="n">register</span><span class="o"><</span><span class="n">T</span><span class="p">:</span> <span class="n">EventListener</span> <span class="o">+</span> <span class="nv">'static</span><span class="o">></span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">listener</span><span class="p">:</span> <span class="n">T</span><span class="p">)</span> <span class="p">{</span>
<span class="k">self</span><span class="py">.listeners</span><span class="nf">.push</span><span class="p">(</span><span class="nn">Box</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="n">listener</span><span class="p">));</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">notify</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="p">,</span> <span class="n">event</span><span class="p">:</span> <span class="nb">u32</span><span class="p">)</span> <span class="p">{</span>
<span class="k">for</span> <span class="n">listener</span> <span class="n">in</span> <span class="o">&</span><span class="k">self</span><span class="py">.listeners</span> <span class="p">{</span>
<span class="n">listener</span><span class="nf">.on_event</span><span class="p">(</span><span class="n">event</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>For convenience, let’s implement our <code class="language-plaintext highlighter-rouge">EventListener</code> trait for closures:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">impl</span><span class="o"><</span><span class="n">F</span><span class="p">:</span> <span class="nf">Fn</span><span class="p">(</span><span class="nb">u32</span><span class="p">)</span><span class="o">></span> <span class="n">EventListener</span> <span class="k">for</span> <span class="n">F</span> <span class="p">{</span>
<span class="k">fn</span> <span class="nf">on_event</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="p">,</span> <span class="n">event</span><span class="p">:</span> <span class="nb">u32</span><span class="p">)</span> <span class="p">{</span>
<span class="k">self</span><span class="p">(</span><span class="n">event</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>This way, its usage is simple:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"> <span class="k">let</span> <span class="k">mut</span> <span class="n">notifier</span> <span class="o">=</span> <span class="nn">Notifier</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
<span class="n">notifier</span><span class="nf">.register</span><span class="p">(|</span><span class="n">event</span><span class="p">|</span> <span class="nd">println!</span><span class="p">(</span><span class="s">"received [{}]"</span><span class="p">,</span> <span class="n">event</span><span class="p">));</span>
<span class="nd">println!</span><span class="p">(</span><span class="s">"notifying..."</span><span class="p">);</span>
<span class="n">notifier</span><span class="nf">.notify</span><span class="p">(</span><span class="mi">42</span><span class="p">);</span></code></pre></figure>
<p>This prints:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>notifying...
received [42]
</code></pre></div></div>
<p>So far, so good.</p>
<p>However, things get more complicated if we want to modify some state when
an event is received. For example, let’s implement a struct to store
all the events received:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">pub</span> <span class="k">struct</span> <span class="n">Storage</span> <span class="p">{</span>
<span class="n">events</span><span class="p">:</span> <span class="nb">Vec</span><span class="o"><</span><span class="nb">u32</span><span class="o">></span><span class="p">,</span>
<span class="p">}</span>
<span class="k">impl</span> <span class="n">Storage</span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">new</span><span class="p">()</span> <span class="k">-></span> <span class="n">Self</span> <span class="p">{</span>
<span class="n">Self</span> <span class="p">{</span> <span class="n">events</span><span class="p">:</span> <span class="nn">Vec</span><span class="p">::</span><span class="nf">new</span><span class="p">()</span> <span class="p">}</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">store</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">value</span><span class="p">:</span> <span class="nb">u32</span><span class="p">)</span> <span class="p">{</span>
<span class="k">self</span><span class="py">.events</span><span class="nf">.push</span><span class="p">(</span><span class="n">value</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">events</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="o">&</span><span class="nb">Vec</span><span class="o"><</span><span class="nb">u32</span><span class="o">></span> <span class="p">{</span>
<span class="o">&</span><span class="k">self</span><span class="py">.events</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>To be able to fill this <code class="language-plaintext highlighter-rouge">Storage</code> on each event received, we must somehow
pass it along with the <em>event listener</em>, which will be stored in
the <code class="language-plaintext highlighter-rouge">Notifier</code>. Therefore, we need a single instance of <code class="language-plaintext highlighter-rouge">Storage</code>
to be <strong>shared</strong> between the caller code and the <code class="language-plaintext highlighter-rouge">Notifier</code>.</p>
<p>Holding two mutable references to the same object obviously violates the borrowing
rules, so we need a <a href="https://doc.rust-lang.org/std/rc/">reference-counting pointer</a>.</p>
<p>However, such a pointer is read-only, so we also need
a <a href="https://doc.rust-lang.org/std/cell/index.html"><code class="language-plaintext highlighter-rouge">RefCell</code></a> for <a href="https://ricardomartins.cc/2016/06/08/interior-mutability"><em>interior mutability</em></a>.</p>
<p>Thus, we are going to use an instance of <code class="language-plaintext highlighter-rouge">Rc<RefCell<Storage>></code>. It may
seem too verbose, but using <code class="language-plaintext highlighter-rouge">Rc<RefCell<T>></code> (or <code class="language-plaintext highlighter-rouge">Arc<Mutex<T>></code> for
<em>thread-safety</em>) is very common in Rust. And <a href="https://www.reddit.com/r/rust/comments/33jv62/vecrcrefcellboxtrait_is_there_a_better_way/">there is worse</a>.</p>
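<p>As an aside, here is a minimal sketch of the thread-safe counterpart mentioned above. It is not part of the observer implementation: a plain <code class="language-plaintext highlighter-rouge">Vec<u32></code> stands in for <code class="language-plaintext highlighter-rouge">Storage</code>, and the function name <code class="language-plaintext highlighter-rouge">collect_from_threads()</code> is hypothetical. <code class="language-plaintext highlighter-rouge">Arc</code> plays the role of <code class="language-plaintext highlighter-rouge">Rc</code>, and <code class="language-plaintext highlighter-rouge">lock()</code> plays the role of <code class="language-plaintext highlighter-rouge">borrow_mut()</code>.</p>

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Spawns `count` threads that each push one value into a shared Vec,
// then returns the collected values in sorted order.
fn collect_from_threads(count: u32) -> Vec<u32> {
    // Shared, mutable list of events (stands in for Storage in this sketch).
    let events = Arc::new(Mutex::new(Vec::<u32>::new()));

    let handles: Vec<_> = (0..count)
        .map(|i| {
            // Each thread owns its own Arc pointing to the same Mutex.
            let events = Arc::clone(&events);
            thread::spawn(move || {
                events.lock().unwrap().push(i);
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // The push order depends on thread scheduling, so sort before returning.
    let mut received = events.lock().unwrap().clone();
    received.sort();
    received
}

fn main() {
    assert_eq!(collect_from_threads(4), vec![0, 1, 2, 3]);
}
```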
<p>Here is what the client code looks like:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"> <span class="k">use</span> <span class="nn">std</span><span class="p">::</span><span class="nn">cell</span><span class="p">::</span><span class="n">RefCell</span><span class="p">;</span>
<span class="k">use</span> <span class="nn">std</span><span class="p">::</span><span class="nn">rc</span><span class="p">::</span><span class="nb">Rc</span><span class="p">;</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">notifier</span> <span class="o">=</span> <span class="nn">Notifier</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
<span class="c">// first Rc to the Storage</span>
<span class="k">let</span> <span class="n">rc</span> <span class="o">=</span> <span class="nn">Rc</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">RefCell</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">Storage</span><span class="p">::</span><span class="nf">new</span><span class="p">()));</span>
<span class="c">// second Rc to the Storage</span>
<span class="k">let</span> <span class="n">rc2</span> <span class="o">=</span> <span class="n">rc</span><span class="nf">.clone</span><span class="p">();</span>
<span class="c">// register the listener saving all the received events to the Storage</span>
<span class="n">notifier</span><span class="nf">.register</span><span class="p">(</span><span class="k">move</span> <span class="p">|</span><span class="n">event</span><span class="p">|</span> <span class="n">rc2</span><span class="nf">.borrow_mut</span><span class="p">()</span><span class="nf">.store</span><span class="p">(</span><span class="n">event</span><span class="p">));</span>
<span class="n">notifier</span><span class="nf">.notify</span><span class="p">(</span><span class="mi">3</span><span class="p">);</span>
<span class="n">notifier</span><span class="nf">.notify</span><span class="p">(</span><span class="mi">141</span><span class="p">);</span>
<span class="n">notifier</span><span class="nf">.notify</span><span class="p">(</span><span class="mi">59</span><span class="p">);</span>
<span class="nd">assert_eq!</span><span class="p">(</span><span class="o">&</span><span class="nd">vec!</span><span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">141</span><span class="p">,</span> <span class="mi">59</span><span class="p">],</span> <span class="n">rc</span><span class="nf">.borrow</span><span class="p">()</span><span class="nf">.events</span><span class="p">());</span></code></pre></figure>
<p>That way, the <code class="language-plaintext highlighter-rouge">Storage</code> is correctly modified from the <em>event
listener</em>.</p>
<p>All is not solved, though. In this example, it was easy: we had
access to the <code class="language-plaintext highlighter-rouge">Rc<RefCell<Storage>></code> instance. What if we only have
access to the <code class="language-plaintext highlighter-rouge">Storage</code>, for example if we want the <code class="language-plaintext highlighter-rouge">Storage</code> to register
itself from one of its methods, without requiring the caller to provide
the <code class="language-plaintext highlighter-rouge">Rc<RefCell<Storage>></code> instance?</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">impl</span> <span class="n">Storage</span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">register_to</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="p">,</span> <span class="n">notifier</span><span class="p">:</span> <span class="o">&</span><span class="k">mut</span> <span class="n">Notifier</span><span class="p">)</span> <span class="p">{</span>
<span class="n">notifier</span><span class="nf">.register</span><span class="p">(</span><span class="k">move</span> <span class="p">|</span><span class="n">event</span><span class="p">|</span> <span class="p">{</span>
<span class="cm">/* how to retrieve a &mut Storage from here? */</span>
<span class="p">});</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>We need to find a way to retrieve the <code class="language-plaintext highlighter-rouge">Rc<RefCell<Storage>></code> from the
<code class="language-plaintext highlighter-rouge">Storage</code>.</p>
<p>To do so, the idea is to make <code class="language-plaintext highlighter-rouge">Storage</code> aware of its reference-counting
pointer. <em>Of course, this only makes sense if <code class="language-plaintext highlighter-rouge">Storage</code> is
constructed inside an <code class="language-plaintext highlighter-rouge">Rc<RefCell<Storage>></code>.</em></p>
<p>This is exactly what <a href="http://en.cppreference.com/w/cpp/memory/enable_shared_from_this"><code class="language-plaintext highlighter-rouge">enable_shared_from_this</code></a> provides in C++, so we
can draw inspiration from <a href="https://stackoverflow.com/a/34062114/1987178">how it works</a>: just
store a <code class="language-plaintext highlighter-rouge">Weak<RefCell<…>></code>, <a href="https://doc.rust-lang.org/std/rc/struct.Rc.html#method.downgrade"><em>downgraded</em></a> from the
<code class="language-plaintext highlighter-rouge">Rc<RefCell<…>></code>, in the struct itself. That way, we can
use it to retrieve a <code class="language-plaintext highlighter-rouge">&mut Storage</code> reference from the <em>event
listener</em>:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">use</span> <span class="nn">std</span><span class="p">::</span><span class="nn">rc</span><span class="p">::{</span><span class="nb">Rc</span><span class="p">,</span> <span class="n">Weak</span><span class="p">};</span>
<span class="k">use</span> <span class="nn">std</span><span class="p">::</span><span class="nn">cell</span><span class="p">::</span><span class="n">RefCell</span><span class="p">;</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">Storage</span> <span class="p">{</span>
<span class="n">self_weak</span><span class="p">:</span> <span class="n">Weak</span><span class="o"><</span><span class="n">RefCell</span><span class="o"><</span><span class="n">Storage</span><span class="o">>></span><span class="p">,</span>
<span class="n">events</span><span class="p">:</span> <span class="nb">Vec</span><span class="o"><</span><span class="nb">u32</span><span class="o">></span><span class="p">,</span>
<span class="p">}</span>
<span class="k">impl</span> <span class="n">Storage</span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">new</span><span class="p">()</span> <span class="k">-></span> <span class="nb">Rc</span><span class="o"><</span><span class="n">RefCell</span><span class="o"><</span><span class="n">Self</span><span class="o">>></span> <span class="p">{</span>
<span class="k">let</span> <span class="n">rc</span> <span class="o">=</span> <span class="nn">Rc</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">RefCell</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="n">Self</span> <span class="p">{</span>
<span class="n">self_weak</span><span class="p">:</span> <span class="nn">Weak</span><span class="p">::</span><span class="nf">new</span><span class="p">(),</span> <span class="c">// initialize empty</span>
<span class="n">events</span><span class="p">:</span> <span class="nn">Vec</span><span class="p">::</span><span class="nf">new</span><span class="p">(),</span>
<span class="p">}));</span>
<span class="c">// set self_weak once we get the Rc instance</span>
<span class="n">rc</span><span class="nf">.borrow_mut</span><span class="p">()</span><span class="py">.self_weak</span> <span class="o">=</span> <span class="nn">Rc</span><span class="p">::</span><span class="nf">downgrade</span><span class="p">(</span><span class="o">&</span><span class="n">rc</span><span class="p">);</span>
<span class="n">rc</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">register_to</span><span class="p">(</span><span class="o">&</span><span class="k">self</span><span class="p">,</span> <span class="n">notifier</span><span class="p">:</span> <span class="o">&</span><span class="k">mut</span> <span class="n">Notifier</span><span class="p">)</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">rc</span> <span class="o">=</span> <span class="k">self</span><span class="py">.self_weak</span><span class="nf">.upgrade</span><span class="p">()</span><span class="nf">.unwrap</span><span class="p">();</span>
<span class="n">notifier</span><span class="nf">.register</span><span class="p">(</span><span class="k">move</span> <span class="p">|</span><span class="n">event</span><span class="p">|</span> <span class="n">rc</span><span class="nf">.borrow_mut</span><span class="p">()</span><span class="nf">.store</span><span class="p">(</span><span class="n">event</span><span class="p">))</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>Here is how to use it:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"> <span class="k">let</span> <span class="k">mut</span> <span class="n">notifier</span> <span class="o">=</span> <span class="nn">Notifier</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
<span class="k">let</span> <span class="n">rc</span> <span class="o">=</span> <span class="nn">Storage</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
<span class="n">rc</span><span class="nf">.borrow</span><span class="p">()</span><span class="nf">.register_to</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="n">notifier</span><span class="p">);</span>
<span class="n">notifier</span><span class="nf">.notify</span><span class="p">(</span><span class="mi">3</span><span class="p">);</span>
<span class="n">notifier</span><span class="nf">.notify</span><span class="p">(</span><span class="mi">141</span><span class="p">);</span>
<span class="n">notifier</span><span class="nf">.notify</span><span class="p">(</span><span class="mi">59</span><span class="p">);</span>
<span class="nd">assert_eq!</span><span class="p">(</span><span class="o">&</span><span class="nd">vec!</span><span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">141</span><span class="p">,</span> <span class="mi">59</span><span class="p">],</span> <span class="n">rc</span><span class="nf">.borrow</span><span class="p">()</span><span class="nf">.events</span><span class="p">());</span></code></pre></figure>
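<p>As a variation on the two-step initialization above, here is a minimal self-contained sketch that builds the weak self-reference with <code class="language-plaintext highlighter-rouge">Rc::new_cyclic</code> from the standard library. The <code class="language-plaintext highlighter-rouge">Notifier</code> shown here is a simplified stand-in written for this example (an assumption, not the one used earlier in the post), and <code class="language-plaintext highlighter-rouge">demo</code> is a hypothetical helper.</p>

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Simplified stand-in for the notifier (assumption): it only owns listeners.
pub struct Notifier {
    listeners: Vec<Box<dyn FnMut(u32)>>,
}

impl Notifier {
    pub fn new() -> Self {
        Self { listeners: Vec::new() }
    }

    pub fn register<F: FnMut(u32) + 'static>(&mut self, listener: F) {
        self.listeners.push(Box::new(listener));
    }

    pub fn notify(&mut self, event: u32) {
        for listener in &mut self.listeners {
            listener(event);
        }
    }
}

pub struct Storage {
    self_weak: Weak<RefCell<Storage>>,
    events: Vec<u32>,
}

impl Storage {
    pub fn new() -> Rc<RefCell<Self>> {
        // new_cyclic provides the Weak while the value is being built,
        // so self_weak never has to start as an empty placeholder
        Rc::new_cyclic(|weak| {
            RefCell::new(Self {
                self_weak: weak.clone(),
                events: Vec::new(),
            })
        })
    }

    pub fn store(&mut self, event: u32) {
        self.events.push(event);
    }

    pub fn events(&self) -> &Vec<u32> {
        &self.events
    }

    pub fn register_to(&self, notifier: &mut Notifier) {
        let rc = self.self_weak.upgrade().expect("Storage must live in an Rc");
        notifier.register(move |event| rc.borrow_mut().store(event));
    }
}

// Hypothetical helper: registers a storage, sends three events and
// returns a copy of what was stored.
pub fn demo() -> Vec<u32> {
    let mut notifier = Notifier::new();
    let storage = Storage::new();
    storage.borrow().register_to(&mut notifier);
    notifier.notify(3);
    notifier.notify(141);
    notifier.notify(59);
    let events = storage.borrow().events().clone();
    events
}
```

<p>The listener keeps a strong <code class="language-plaintext highlighter-rouge">Rc</code> while the structure only keeps a <code class="language-plaintext highlighter-rouge">Weak</code>, so the storage does not keep itself alive.</p>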
<p>So it is possible to implement the <em>observer</em> design pattern in Rust, but
it is a bit harder than in Java ;-)</p>
<p>When possible, it is probably preferable to avoid it.</p>
<h3 id="partage-de-données-mutables">Sharing mutable data</h3>
<blockquote>
<p>Mutable references cannot be <a href="https://doc.rust-lang.org/nomicon/references.html">aliased</a>.</p>
</blockquote>
<p>How can we share mutable data, then?</p>
<p>We have seen that we can use <code class="language-plaintext highlighter-rouge">Rc<RefCell<…>></code> (or <code class="language-plaintext highlighter-rouge">Arc<Mutex<…>></code>),
which enforces the borrowing rules at runtime. However, this is not always
desirable:</p>
<ul>
<li>it forces a new allocation on the heap,</li>
<li>each access has a runtime cost,</li>
<li>the borrow always covers the whole resource.</li>
</ul>
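<p>The last two points can be observed with a small sketch. The <code class="language-plaintext highlighter-rouge">Pair</code> type and the <code class="language-plaintext highlighter-rouge">whole_resource_borrow</code> function are hypothetical names used only for this illustration: while one field is being modified through a <code class="language-plaintext highlighter-rouge">RefCell</code>, the runtime check refuses access to the other field too.</p>

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical structure with two independent fields.
struct Pair {
    left: u32,
    right: u32,
}

// Returns whether a second mutable borrow was refused while `left` was
// being modified, followed by the final values of `left` and `right`.
fn whole_resource_borrow() -> (bool, u32, u32) {
    let shared = Rc::new(RefCell::new(Pair { left: 0, right: 0 }));
    let other = Rc::clone(&shared);

    let refused;
    {
        // borrowing to touch `left` locks the whole Pair...
        let mut guard = shared.borrow_mut();
        guard.left = 1;
        // ...so `right` cannot be borrowed through the other handle
        refused = other.try_borrow_mut().is_err();
    } // guard dropped here: the runtime borrow flag is released

    other.borrow_mut().right = 2;
    let pair = shared.borrow();
    (refused, pair.left, pair.right)
}
```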
<p>Instead, we could use <em>raw</em> pointers manually in
<a href="https://doc.rust-lang.org/book/first-edition/unsafe.html">unsafe</a> code, but then it would be <em>unsafe</em>.</p>
<p>There is another solution, which consists in exposing <strong>temporary borrowing
views</strong> of an object. Let me explain.</p>
<p>In <em>Gnirehtet</em>, a packet contains a reference to the raw data
(stored in a buffer somewhere) along with the values of the
<a href="https://en.wikipedia.org/wiki/IPv4#Packet_structure">IP</a> and <a href="https://en.wikipedia.org/wiki/Transmission_Control_Protocol#TCP_segment_structure">TCP</a>/<a href="https://en.wikipedia.org/wiki/User_Datagram_Protocol#Packet_structure">UDP</a> header fields (parsed from the raw byte array). We
could have used a flat structure to store everything:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">pub</span> <span class="k">struct</span> <span class="n">Packet</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="n">raw</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">],</span>
<span class="n">ipv4_source</span><span class="p">:</span> <span class="nb">u32</span><span class="p">,</span>
<span class="n">ipv4_destination</span><span class="p">:</span> <span class="nb">u32</span><span class="p">,</span>
<span class="n">ipv4_protocol</span><span class="p">:</span> <span class="nb">u8</span><span class="p">,</span>
<span class="c">// + other ipv4 fields</span>
<span class="n">transport_source</span><span class="p">:</span> <span class="nb">u16</span><span class="p">,</span>
<span class="n">transport_destination</span><span class="p">:</span> <span class="nb">u16</span><span class="p">,</span>
<span class="c">// + other transport fields</span>
<span class="p">}</span></code></pre></figure>
<p>The <code class="language-plaintext highlighter-rouge">Packet</code> would have provided <em>setters</em> for all the header fields
(updating both the packet fields and the byte array). For example:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">impl</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="n">Packet</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">set_transport_source</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">transport_source</span><span class="p">:</span> <span class="nb">u16</span><span class="p">)</span> <span class="p">{</span>
<span class="k">self</span><span class="py">.transport_source</span> <span class="o">=</span> <span class="n">transport_source</span><span class="p">;</span>
<span class="k">let</span> <span class="n">transport</span> <span class="o">=</span> <span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.raw</span><span class="p">[</span><span class="mi">20</span><span class="o">..</span><span class="p">];</span>
<span class="nn">BigEndian</span><span class="p">::</span><span class="nf">write_u16</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="n">transport</span><span class="p">[</span><span class="mi">0</span><span class="o">..</span><span class="mi">2</span><span class="p">],</span> <span class="n">transport_source</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>But this design would not be great (especially since the TCP and UDP header
fields are different).</p>
<p>Instead, we would like to extract the IP and transport headers into separate
structures, each managing its own part of the byte array:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="c">// violates the borrowing rules</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">Packet</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="n">raw</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">],</span> <span class="c">// the whole packet (including headers)</span>
<span class="n">ipv4_header</span><span class="p">:</span> <span class="n">Ipv4Header</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span><span class="p">,</span>
<span class="n">transport_header</span><span class="p">:</span> <span class="n">TransportHeader</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span><span class="p">,</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">Ipv4Header</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="n">raw</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">],</span> <span class="c">// slice related to ipv4 headers</span>
<span class="n">source</span><span class="p">:</span> <span class="nb">u32</span><span class="p">,</span>
<span class="n">destination</span><span class="p">:</span> <span class="nb">u32</span><span class="p">,</span>
<span class="n">protocol</span><span class="p">:</span> <span class="nb">u8</span><span class="p">,</span>
<span class="c">// + other ipv4 fields</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">TransportHeader</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="n">raw</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">],</span> <span class="c">// slice related to transport headers</span>
<span class="n">source</span><span class="p">:</span> <span class="nb">u16</span><span class="p">,</span>
<span class="n">destination</span><span class="p">:</span> <span class="nb">u16</span><span class="p">,</span>
<span class="c">// + other transport fields</span>
<span class="p">}</span></code></pre></figure>
<p>You immediately spotted the problem: <strong>there are several references to
the same resource, the <code class="language-plaintext highlighter-rouge">raw</code> byte array, at the same time</strong>.</p>
<p><em>Note that <a href="https://doc.rust-lang.org/std/primitive.slice.html#method.split_at_mut">splitting</a> the array is not an option here, since
the slices of <code class="language-plaintext highlighter-rouge">raw</code> overlap: we need to write the complete packet
at once to the network layer, so the <code class="language-plaintext highlighter-rouge">raw</code> array in
<code class="language-plaintext highlighter-rouge">Packet</code> must include the headers.</em></p>
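<p>For reference, here is a minimal sketch of what <code class="language-plaintext highlighter-rouge">split_at_mut</code> does allow: two <em>disjoint</em> mutable slices that are usable at the same time. The function name and the offsets are arbitrary choices for this illustration. It does not help in the packet case precisely because the whole array must also stay reachable.</p>

```rust
// Splits a buffer into two disjoint mutable halves and writes to both.
// Expects at least 22 bytes; the offsets are arbitrary, for illustration.
fn split_headers(raw: &mut [u8]) {
    let (ipv4, transport) = raw.split_at_mut(20);
    ipv4[0] = 0x45; // first byte of the first half
    transport[0] = 0x04; // first byte of the second half
    // both halves are alive here, so `raw` itself cannot be used,
    // but the halves can be read and written together
    transport[1] = ipv4[0];
}
```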
<p>We need a solution compatible with the borrowing rules.</p>
<p>Here is the one I came up with:</p>
<ul>
<li>store the header data separately, without the <em>slices</em> of <code class="language-plaintext highlighter-rouge">raw</code>,</li>
<li>create <em>view</em> structures for the IP and transport headers, bound
to a <a href="https://doc.rust-lang.org/book/first-edition/lifetimes.html#in-structs">lifetime</a>,</li>
<li>expose <code class="language-plaintext highlighter-rouge">Packet</code> methods returning <em>view</em> instances.</li>
</ul>
<p>And here is a simplification of the actual implementation:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">pub</span> <span class="k">struct</span> <span class="n">Packet</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="n">raw</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">],</span>
<span class="n">ipv4_header</span><span class="p">:</span> <span class="n">Ipv4HeaderData</span><span class="p">,</span>
<span class="n">transport_header</span><span class="p">:</span> <span class="n">TransportHeaderData</span><span class="p">,</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">Ipv4HeaderData</span> <span class="p">{</span>
<span class="n">source</span><span class="p">:</span> <span class="nb">u32</span><span class="p">,</span>
<span class="n">destination</span><span class="p">:</span> <span class="nb">u32</span><span class="p">,</span>
<span class="n">protocol</span><span class="p">:</span> <span class="nb">u8</span><span class="p">,</span>
<span class="c">// + other ipv4 fields</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">TransportHeaderData</span> <span class="p">{</span>
<span class="n">source</span><span class="p">:</span> <span class="nb">u16</span><span class="p">,</span>
<span class="n">destination</span><span class="p">:</span> <span class="nb">u16</span><span class="p">,</span>
<span class="c">// + other transport fields</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">Ipv4Header</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="n">raw</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">],</span>
<span class="n">data</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="k">mut</span> <span class="n">Ipv4HeaderData</span><span class="p">,</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">TransportHeader</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="n">raw</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">],</span>
<span class="n">data</span><span class="p">:</span> <span class="o">&</span><span class="nv">'a</span> <span class="k">mut</span> <span class="n">TransportHeaderData</span><span class="p">,</span>
<span class="p">}</span>
<span class="k">impl</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="n">Packet</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">ipv4_header</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="n">Ipv4Header</span> <span class="p">{</span>
<span class="n">Ipv4Header</span> <span class="p">{</span>
<span class="n">raw</span><span class="p">:</span> <span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.raw</span><span class="p">[</span><span class="o">..</span><span class="mi">20</span><span class="p">],</span>
<span class="n">data</span><span class="p">:</span> <span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.ipv4_header</span><span class="p">,</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">transport_header</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-></span> <span class="n">TransportHeader</span> <span class="p">{</span>
<span class="n">TransportHeader</span> <span class="p">{</span>
<span class="n">raw</span><span class="p">:</span> <span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.raw</span><span class="p">[</span><span class="mi">20</span><span class="o">..</span><span class="mi">40</span><span class="p">],</span>
<span class="n">data</span><span class="p">:</span> <span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.transport_header</span><span class="p">,</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>The <em>setters</em> are implemented on the views, where they hold a mutable
reference to the raw array:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">impl</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="n">TransportHeader</span><span class="o"><</span><span class="nv">'a</span><span class="o">></span> <span class="p">{</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">set_source</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">source</span><span class="p">:</span> <span class="nb">u16</span><span class="p">)</span> <span class="p">{</span>
<span class="k">self</span><span class="py">.data.source</span> <span class="o">=</span> <span class="n">source</span><span class="p">;</span>
<span class="nn">BigEndian</span><span class="p">::</span><span class="nf">write_u16</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.raw</span><span class="p">[</span><span class="mi">0</span><span class="o">..</span><span class="mi">2</span><span class="p">],</span> <span class="n">source</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">set_destination</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">destination</span><span class="p">:</span> <span class="nb">u16</span><span class="p">)</span> <span class="p">{</span>
<span class="k">self</span><span class="py">.data.destination</span> <span class="o">=</span> <span class="n">destination</span><span class="p">;</span>
<span class="nn">BigEndian</span><span class="p">::</span><span class="nf">write_u16</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="py">.raw</span><span class="p">[</span><span class="mi">2</span><span class="o">..</span><span class="mi">4</span><span class="p">],</span> <span class="n">destination</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>This way, the borrowing rules are respected, and the API is elegant:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"> <span class="k">let</span> <span class="k">mut</span> <span class="n">packet</span> <span class="o">=</span> <span class="err">…</span><span class="p">;</span>
<span class="c">// "transport_header" borrows "packet" during its scope</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">transport_header</span> <span class="o">=</span> <span class="n">packet</span><span class="nf">.transport_header</span><span class="p">();</span>
<span class="n">transport_header</span><span class="nf">.set_source</span><span class="p">(</span><span class="mi">1234</span><span class="p">);</span>
<span class="n">transport_header</span><span class="nf">.set_destination</span><span class="p">(</span><span class="mi">1234</span><span class="p">);</span></code></pre></figure>
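<p>To make the pattern easier to try out, here is a reduced, self-contained sketch of the same idea limited to the transport header. It is not the actual <em>Gnirehtet</em> code: <code class="language-plaintext highlighter-rouge">Packet::new</code>, <code class="language-plaintext highlighter-rouge">ports</code>, <code class="language-plaintext highlighter-rouge">raw</code> and <code class="language-plaintext highlighter-rouge">demo</code> are hypothetical helpers, the IPv4 header is assumed to be exactly 20 bytes long, and the standard <code class="language-plaintext highlighter-rouge">to_be_bytes</code> replaces <code class="language-plaintext highlighter-rouge">BigEndian::write_u16</code> to avoid an external crate.</p>

```rust
pub struct TransportHeaderData {
    source: u16,
    destination: u16,
}

pub struct Packet<'a> {
    raw: &'a mut [u8],
    transport_header: TransportHeaderData,
}

// Temporary view: borrows a slice of the raw bytes and the parsed fields.
pub struct TransportHeader<'a> {
    raw: &'a mut [u8],
    data: &'a mut TransportHeaderData,
}

impl<'a> Packet<'a> {
    // Assumes a fixed 20-byte IPv4 header followed by at least 4 bytes.
    pub fn new(raw: &'a mut [u8]) -> Self {
        let data = TransportHeaderData {
            source: u16::from_be_bytes([raw[20], raw[21]]),
            destination: u16::from_be_bytes([raw[22], raw[23]]),
        };
        Self { raw, transport_header: data }
    }

    // The returned view mutably borrows the packet for as long as it lives.
    pub fn transport_header(&mut self) -> TransportHeader<'_> {
        TransportHeader {
            raw: &mut self.raw[20..],
            data: &mut self.transport_header,
        }
    }

    pub fn ports(&self) -> (u16, u16) {
        (self.transport_header.source, self.transport_header.destination)
    }

    pub fn raw(&self) -> &[u8] {
        &*self.raw
    }
}

impl<'a> TransportHeader<'a> {
    // Each setter updates the parsed field and the raw bytes together.
    pub fn set_source(&mut self, source: u16) {
        self.data.source = source;
        self.raw[0..2].copy_from_slice(&source.to_be_bytes());
    }

    pub fn set_destination(&mut self, destination: u16) {
        self.data.destination = destination;
        self.raw[2..4].copy_from_slice(&destination.to_be_bytes());
    }
}

// Hypothetical helper: edits both ports through a view, then reads the
// parsed fields and the raw bytes back from the packet.
pub fn demo() -> (u16, u16, Vec<u8>) {
    let mut buffer = [0u8; 24];
    let mut packet = Packet::new(&mut buffer);
    {
        let mut transport_header = packet.transport_header();
        transport_header.set_source(1234);
        transport_header.set_destination(80);
    } // the view ends here, so the packet can be used again
    let (source, destination) = packet.ports();
    let bytes = packet.raw()[20..24].to_vec();
    (source, destination, bytes)
}
```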
<h3 id="limitations-du-compilateur">Compiler limitations</h3>
<p>Rust is a young language, and the compiler has some annoying problems.</p>
<p>The worst, in my opinion, is related to <a href="http://smallcultfollowing.com/babysteps/blog/2016/04/27/non-lexical-lifetimes-introduction/#problem-case-2-conditional-control-flow">non-lexical lifetimes</a>, which leads to <a href="https://stackoverflow.com/questions/44417491/non-lexical-lifetime-workaround-failure">unexpected errors</a>:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">struct</span> <span class="n">Container</span> <span class="p">{</span>
<span class="n">vec</span><span class="p">:</span> <span class="nb">Vec</span><span class="o"><</span><span class="nb">i32</span><span class="o">></span><span class="p">,</span>
<span class="p">}</span>
<span class="k">impl</span> <span class="n">Container</span> <span class="p">{</span>
<span class="k">fn</span> <span class="nf">find</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">v</span><span class="p">:</span> <span class="nb">i32</span><span class="p">)</span> <span class="k">-></span> <span class="nb">Option</span><span class="o"><&</span><span class="k">mut</span> <span class="nb">i32</span><span class="o">></span> <span class="p">{</span>
<span class="nb">None</span> <span class="c">// we don't care about the implementation</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="nf">get</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">v</span><span class="p">:</span> <span class="nb">i32</span><span class="p">)</span> <span class="k">-></span> <span class="o">&</span><span class="k">mut</span> <span class="nb">i32</span> <span class="p">{</span>
<span class="k">if</span> <span class="k">let</span> <span class="nf">Some</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="o">=</span> <span class="k">self</span><span class="nf">.find</span><span class="p">(</span><span class="n">v</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="n">x</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">self</span><span class="py">.vec</span><span class="nf">.push</span><span class="p">(</span><span class="n">v</span><span class="p">);</span>
<span class="k">self</span><span class="py">.vec</span><span class="nf">.last_mut</span><span class="p">()</span><span class="nf">.unwrap</span><span class="p">()</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>error[E0499]: cannot borrow `self.vec` as mutable more than once at a time
--> sample.rs:14:9
|
11 | if let Some(x) = self.find(v) {
| ---- first mutable borrow occurs here
...
14 | self.vec.push(v);
| ^^^^^^^^ second mutable borrow occurs here
15 | self.vec.last_mut().unwrap()
16 | }
| - first borrow ends here
</code></pre></div></div>
<p>Fortunately, <a href="http://smallcultfollowing.com/babysteps/blog/2017/07/11/non-lexical-lifetimes-draft-rfc-and-prototype-available/">this should be fixed soon</a>.</p>
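<p>In the meantime, one possible workaround (a sketch, not necessarily the best fit for every case) is to search for a position rather than a reference, so that no borrow of <code class="language-plaintext highlighter-rouge">self</code> survives the lookup. The <code class="language-plaintext highlighter-rouge">find_index</code> method and the <code class="language-plaintext highlighter-rouge">demo</code> function are hypothetical names introduced for this example.</p>

```rust
struct Container {
    vec: Vec<i32>,
}

impl Container {
    // Returns a position instead of a reference, so the borrow of `self`
    // ends as soon as the call returns.
    fn find_index(&self, v: i32) -> Option<usize> {
        self.vec.iter().position(|&x| x == v)
    }

    fn get(&mut self, v: i32) -> &mut i32 {
        if let Some(index) = self.find_index(v) {
            // the mutable borrow starts here, inside the returning branch
            return &mut self.vec[index];
        }
        self.vec.push(v);
        self.vec.last_mut().unwrap()
    }
}

// Hypothetical helper exercising both paths of `get`.
fn demo() -> Vec<i32> {
    let mut container = Container { vec: vec![1, 2] };
    *container.get(2) += 10; // existing value, modified in place
    *container.get(7) += 1; // missing value, pushed then modified
    container.vec
}
```

<p>The price is a second lookup (here, an indexing operation) on the path where the value already exists.</p>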
<p>The <a href="https://github.com/rust-lang/rfcs/blob/master/text/1522-conservative-impl-trait.md"><em>Impl Trait</em></a> feature, allowing functions to
return <em>unboxed</em> abstract types, should also improve the experience
(there is also an <a href="https://github.com/rust-lang/rfcs/blob/master/text/1951-expand-impl-trait.md">expanded</a> proposal).</p>
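<p>As a short illustration, assuming a compiler where <code class="language-plaintext highlighter-rouge">impl Trait</code> in return position is available, a function can return a closure or an iterator without boxing it and without naming its concrete type. The function names below are made up for this example.</p>

```rust
// Returns a closure without boxing it: the concrete type stays hidden
// behind `impl Fn`.
fn make_adder(offset: u32) -> impl Fn(u32) -> u32 {
    move |value| value + offset
}

// Same idea for iterator chains, whose concrete types are tedious or
// impossible to write out.
fn even_squares(limit: u32) -> impl Iterator<Item = u32> {
    (0..limit).filter(|n| n % 2 == 0).map(|n| n * n)
}
```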
<p>The compiler generally produces very helpful error messages. But
when it does not, they can be very <a href="https://stackoverflow.com/questions/44003622/implementing-trait-for-fnsomething-in-rust">confusing</a>.</p>
<h2 id="sûreté-et-pièges">Safety and pitfalls</h2>
<p>The <a href="https://doc.rust-lang.org/nomicon/meet-safe-and-unsafe.html">first chapter of the <em>Rustonomicon</em></a> says:</p>
<blockquote>
<p>Safe Rust is For Reals Totally Safe.</p>
<p>[…]</p>
<p>Safe Rust is the true Rust programming language. If all you do is write Safe
Rust, you will never have to worry about type-safety or memory-safety. You
will never endure a null or dangling pointer, or any of that Undefined
Behavior nonsense.</p>
</blockquote>
<p>That is the goal. And it is <em>almost</em> true.</p>
<h3 id="leakpocalypse">Leakpocalypse</h3>
<p>In the past, it was <a href="https://github.com/rust-lang/rust/issues/24292">possible</a> to write <em>safe Rust</em> code
<strong>accessing freed memory</strong>.</p>
<p>This “<a href="http://cglab.ca/~abeinges/blah/everyone-poops/">leakpocalypse</a>” led to the <a href="https://github.com/alexcrichton/rfcs/blob/safe-mem-forget/text/0000-safe-mem-forget.md">clarification</a> of the safety
guarantees: not running a destructor is now <a href="https://github.com/rust-lang/rfcs/pull/1066">considered
<em>safe</em></a>. In other words, <strong>memory safety can no longer
rely on <a href="https://en.wikipedia.org/wiki/Resource_acquisition_is_initialization">RAII</a></strong> (in fact, it never could, but this was only noticed
belatedly).</p>
<p>As a consequence, <a href="https://doc.rust-lang.org/std/mem/fn.forget.html"><code class="language-plaintext highlighter-rouge">std::mem::forget</code></a> is now <em>safe</em>, and <a href="https://doc.rust-lang.org/1.0.0/std/thread/struct.JoinGuard.html"><code class="language-plaintext highlighter-rouge">JoinGuard</code></a> has
been deprecated and removed from the standard library (it was moved to a
<a href="http://arcnmx.github.io/thread-scoped-rs/thread_scoped/">separate crate</a>).</p>
<p>Other tools relying on RAII (such as <a href="https://doc.rust-lang.org/std/vec/struct.Vec.html#method.drain"><code class="language-plaintext highlighter-rouge">Vec::drain()</code></a>) must take
<a href="https://github.com/rust-lang/rust/blob/1.20.0/src/liballoc/vec.rs#L1094-L1102">special care</a> to guarantee that memory
will not be corrupted.</p>
<p>Phew, <em>memory safety</em> is (now) saved.</p>
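<p>A minimal sketch of what “not running a destructor is safe” means in practice: the hypothetical <code class="language-plaintext highlighter-rouge">Guard</code> type below records when its destructor runs, and <code class="language-plaintext highlighter-rouge">std::mem::forget</code> skips it without any <code class="language-plaintext highlighter-rouge">unsafe</code> block.</p>

```rust
use std::cell::Cell;
use std::mem;
use std::rc::Rc;

// Hypothetical guard that records when its destructor runs.
struct Guard {
    dropped: Rc<Cell<bool>>,
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

// Returns (destructor ran after drop, destructor ran after forget).
fn drop_versus_forget() -> (bool, bool) {
    let dropped = Rc::new(Cell::new(false));
    drop(Guard { dropped: Rc::clone(&dropped) });

    let forgotten = Rc::new(Cell::new(false));
    // no `unsafe` needed: the guard is leaked and its destructor never runs
    mem::forget(Guard { dropped: Rc::clone(&forgotten) });

    (dropped.get(), forgotten.get())
}
```

<p>Any API whose memory safety depended on <code class="language-plaintext highlighter-rouge">Guard::drop</code> being executed would therefore be unsound.</p>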
<h3 id="infinité-indéfinie">Undefined infinity</h3>
<p>In C and C++, <a href="https://stackoverflow.com/questions/3592557/optimizing-away-a-while1-in-c0x">infinite loops</a> without side effects are a case of
<a href="http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1528.htm"><em>undefined behavior</em></a>. Because of this, it is possible to write
programs that unexpectedly <a href="https://blog.regehr.org/archives/140">disprove Fermat’s Last
Theorem</a>.</p>
<p>In practice, the Rust compiler relies on LLVM, which (currently) applies
its optimizations assuming that infinite loops without side effects
are <em>undefined behavior</em>. As a consequence, such <em>undefined
behaviors</em> also occur in Rust.</p>
<p>Here is a minimal example to observe it:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="nf">infinite</span><span class="p">()</span> <span class="p">{</span>
<span class="k">loop</span> <span class="p">{}</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
<span class="nf">infinite</span><span class="p">();</span>
<span class="p">}</span></code></pre></figure>
<p>When run without optimizations, it behaves as “expected”:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ rustc ub.rs && ./ub
^C (infinite loop, interrupt it)
</code></pre></div></div>
<p>But enabling optimizations makes the program <em>panic</em>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ rustc -O ub.rs && ./ub
thread 'main' panicked at 'assertion failed: c.borrow().is_none()', /checkout/src/libstd/sys_common/thread_info.rs:51
note: Run with `RUST_BACKTRACE=1` for a backtrace.
</code></pre></div></div>
<p>We can also produce unexpected results without a crash:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="nf">infinite</span><span class="p">(</span><span class="k">mut</span> <span class="n">value</span><span class="p">:</span> <span class="nb">u32</span><span class="p">)</span> <span class="p">{</span>
<span class="c">// infinite loop unless value initially equals 0</span>
<span class="k">while</span> <span class="n">value</span> <span class="o">!=</span> <span class="mi">0</span> <span class="p">{</span>
<span class="k">if</span> <span class="n">value</span> <span class="o">!=</span> <span class="mi">1</span> <span class="p">{</span>
<span class="n">value</span> <span class="o">-=</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
<span class="nf">infinite</span><span class="p">(</span><span class="mi">42</span><span class="p">);</span>
<span class="nd">println!</span><span class="p">(</span><span class="s">"end"</span><span class="p">);</span>
<span class="p">}</span></code></pre></figure>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ rustc ub.rs && ./ub
^C (infinite loop, interrupt it)
</code></pre></div></div>
<p>But with optimizations:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ rustc -O ub.rs && ./ub
end
</code></pre></div></div>
<p>This is a corner case, which will probably be fixed in the future. In
practice, <strong>Rust’s safety guarantees are very strong</strong> (at the cost of being
constraining).</p>
<h3 id="erreur-de-segmentation">Segmentation fault</h3>
<p><em>This section was added after publication.</em></p>
<p>There are other sources of <em>undefined behavior</em> (see the <a href="https://github.com/rust-lang/rust/labels/I-unsound"><em>issues</em> tagged
<em>I-unsound</em></a>).</p>
<p>For example, <em>casting</em> a floating-point value that cannot be represented in
the target type is <em>undefined behavior</em>, which can be <a href="https://github.com/rust-lang/rust/issues/10184#issuecomment-139858153">propagated</a> to
cause a segmentation fault:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="nd">#[inline(never)]</span>
<span class="k">pub</span> <span class="k">fn</span> <span class="nf">f</span><span class="p">(</span><span class="n">ary</span><span class="p">:</span> <span class="o">&</span><span class="p">[</span><span class="nb">u8</span><span class="p">;</span> <span class="mi">5</span><span class="p">])</span> <span class="k">-></span> <span class="o">&</span><span class="p">[</span><span class="nb">u8</span><span class="p">]</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">idx</span> <span class="o">=</span> <span class="mf">1e100f64</span> <span class="k">as</span> <span class="nb">usize</span><span class="p">;</span>
<span class="o">&</span><span class="n">ary</span><span class="p">[</span><span class="n">idx</span><span class="o">..</span><span class="p">]</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
<span class="nd">println!</span><span class="p">(</span><span class="s">"{}"</span><span class="p">,</span> <span class="nf">f</span><span class="p">(</span><span class="o">&</span><span class="p">[</span><span class="mi">1</span><span class="p">;</span> <span class="mi">5</span><span class="p">])[</span><span class="mi">0xdeadbeef</span><span class="p">]);</span>
<span class="p">}</span></code></pre></figure>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ rustc -O ub.rs && ./ub
Segmentation fault
</code></pre></div></div>
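<p>One way to avoid depending on what an out-of-range cast does is to validate the
value before converting it. The following is only a sketch under that
assumption: <code class="language-plaintext highlighter-rouge">checked_index</code> and <code class="language-plaintext highlighter-rouge">tail</code> are hypothetical helpers written for this
illustration, not functions from the standard library.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust">/// Hypothetical helper: convert a float to an index only if it is a
/// non-negative whole number that does not exceed `len`.
fn checked_index(value: f64, len: usize) -> Option<usize> {
    // Reject NaN, infinities, negative and fractional values
    if !value.is_finite() || value < 0.0 || value.fract() != 0.0 {
        return None;
    }
    // Reject values beyond the slice length before casting
    if value > len as f64 {
        return None;
    }
    Some(value as usize)
}

/// Hypothetical variant of `f` that returns None instead of slicing
/// with an index obtained from an unchecked cast.
fn tail(ary: &[u8; 5], raw_idx: f64) -> Option<&[u8]> {
    let idx = checked_index(raw_idx, ary.len())?;
    Some(&ary[idx..])
}

fn main() {
    let ary = [1u8; 5];
    assert_eq!(tail(&ary, 2.0), Some(&ary[2..]));
    assert_eq!(tail(&ary, 1e100), None);
    assert_eq!(tail(&ary, f64::NAN), None);
    println!("ok");
}</code></pre></figure>
<p>The caller then has to handle the <code class="language-plaintext highlighter-rouge">None</code> case explicitly, which is more verbose
but does not depend on how the compiler treats the cast.</p>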
<h2 id="stats">Stats</h2>
<p>That’s all for my feedback on the language itself.</p>
<p>As a bonus, let’s compare the <em>Java</em> and <em>Rust</em> versions of the relay server.</p>
<h3 id="nombre-de-lignes">Number of lines</h3>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cloc relay-{java,rust}/src
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Rust                            29            687            655           4506
Java                            37            726            701           2931
-------------------------------------------------------------------------------
</code></pre></div></div>
<p><em>(tests included)</em></p>
<p>The Rust project is significantly bigger, for several reasons:</p>
<ul>
<li>there are many <a href="#partage-de-donnes-mutables"><em>borrowing view</em></a> classes;</li>
<li>the Rust version defines its own asynchronous I/O <em>selector</em> class,
wrapping the lower-level <a href="https://docs.rs/mio/0.6.10/mio/struct.Poll.html"><code class="language-plaintext highlighter-rouge">Poll</code></a>, whereas the Java version
uses the standard <a href="https://docs.oracle.com/javase/8/docs/api/java/nio/channels/Selector.html"><code class="language-plaintext highlighter-rouge">Selector</code></a>;</li>
<li>the <a href="https://doc.rust-lang.org/book/first-edition/error-handling.html">error handling</a> for command-line parsing
is more verbose.</li>
</ul>
<p>The Java version has more files because its unit tests are in separate files,
whereas in Rust they live in the same file as the classes they
test.</p>
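<p>As an illustration of that last point, here is a minimal sketch of the usual
Rust layout, where a test module sits next to the code it tests. The
<code class="language-plaintext highlighter-rouge">checksum</code> function is a made-up example, not code from gnirehtet.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust">/// Hypothetical function under test: sum all bytes, wrapping on overflow.
fn checksum(data: &[u8]) -> u16 {
    data.iter().fold(0u16, |acc, &b| acc.wrapping_add(u16::from(b)))
}

// The test module lives in the same file as the code it tests, and is only
// compiled when building tests (`cargo test` or `rustc --test`).
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn checksum_of_empty_slice_is_zero() {
        assert_eq!(checksum(&[]), 0);
    }

    #[test]
    fn checksum_sums_bytes() {
        assert_eq!(checksum(&[1, 2, 3]), 6);
    }
}

fn main() {
    println!("{}", checksum(&[1, 2, 3]));
}</code></pre></figure>
<p>With this layout, tests count towards the line total of the source file but
do not add files, which is consistent with the figures above.</p>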
<p>Just for information, here are the results for the Android client:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cloc app/src
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Java                            15            198            321            875
XML                              6              7              2             76
-------------------------------------------------------------------------------
SUM:                            21            205            323            951
-------------------------------------------------------------------------------
</code></pre></div></div>
<h3 id="taille-des-binaires">Binary size</h3>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>--------------------------------------------
Java     gnirehtet.jar                   61K
--------------------------------------------
Rust     gnirehtet                      3.0M
         after "strip -g gnirehtet"     747K
         after "strip gnirehtet"        588K
--------------------------------------------
</code></pre></div></div>
<p>The Java binary itself is much smaller. The comparison is not fair, though,
since it requires the <em>Java runtime environment</em>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ du -sh /usr/lib/jvm/java-1.8.0-openjdk-amd64/
156M /usr/lib/jvm/java-1.8.0-openjdk-amd64/
</code></pre></div></div>
<h3 id="utilisation-mémoire">Memory usage</h3>
<p>With a single open TCP connection, here is the memory consumption of the
Java relay server:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ sudo pmap -x $RELAY_JAVA_PID
                  Kbytes     RSS   Dirty
total kB         4364052   86148   69316
</code></pre></div></div>
<p><em>(output filtered)</em></p>
<p>And for the Rust relay server:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ sudo pmap -x $RELAY_RUST_PID
                  Kbytes     RSS   Dirty
total kB           19272    2736     640
</code></pre></div></div>
<p><em>Look at the <a href="https://en.wikipedia.org/wiki/Resident_set_size">RSS</a> value, which indicates the memory actually in use.</em></p>
<p>As expected, the Java version uses more memory
(86 MB) than the Rust version (less than 3 MB). Moreover, its value is unstable
because of the allocation of small objects and their <a href="https://en.wikipedia.org/wiki/Garbage_collection_(computer_science)"><em>garbage collection</em></a>, which
also generates more <em>dirty pages</em>. The value for Rust, on the other hand, is
very stable: once the connection is established, there are no more memory allocations
<em>at all</em>.</p>
<h3 id="utilisation-cpu">CPU usage</h3>
<p>To compare CPU usage, here is my scenario: a 500 MB file is
hosted by Apache on my computer, I start the relay server with <code class="language-plaintext highlighter-rouge">perf stat</code>,
then I download the file from Firefox on Android. As soon as the
file is downloaded, I stop the relay server (Ctrl+C).</p>
<p>Here are the results for the Java version:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ perf stat -B java -jar gnirehtet.jar relay
 Performance counter stats for 'java -jar gnirehtet.jar relay':

      11805,458302      task-clock:u (msec)          #    0,088 CPUs utilized
                 0      context-switches:u           #    0,000 K/sec
                 0      cpu-migrations:u             #    0,000 K/sec
            28 618      page-faults:u                #    0,002 M/sec
    17 908 360 446      cycles:u                     #    1,517 GHz
    13 944 172 792      stalled-cycles-frontend:u    #   77,86% frontend cycles idle
    18 437 279 663      instructions:u               #    1,03 insn per cycle
                                                     #    0,76 stalled cycles per insn
     3 088 215 431      branches:u                   #  261,592 M/sec
        70 647 760      branch-misses:u              #    2,29% of all branches

     133,975117164 seconds time elapsed
</code></pre></div></div>
<p>And for the Rust version:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ perf stat -B ./gnirehtet relay
 Performance counter stats for 'target/release/gnirehtet relay':

       2707,479968      task-clock:u (msec)          #    0,020 CPUs utilized
                 0      context-switches:u           #    0,000 K/sec
                 0      cpu-migrations:u             #    0,000 K/sec
             1 001      page-faults:u                #    0,370 K/sec
     1 011 527 340      cycles:u                     #    0,374 GHz
     2 033 810 378      stalled-cycles-frontend:u    #  201,06% frontend cycles idle
       981 103 003      instructions:u               #    0,97 insn per cycle
                                                     #    2,07 stalled cycles per insn
        98 929 222      branches:u                   #   36,539 M/sec
         3 220 527      branch-misses:u              #    3,26% of all branches

     133,766035253 seconds time elapsed
</code></pre></div></div>
<p>I am not an expert at analyzing these results, but from what I
understand of the <code class="language-plaintext highlighter-rouge">task-clock:u</code> value, the Rust version uses about 4× less
CPU time.</p>
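<p>The ratio can be checked directly from the two <code class="language-plaintext highlighter-rouge">perf stat</code> outputs. The snippet
below is just that arithmetic, written as a small Rust program; the
<code class="language-plaintext highlighter-rouge">ratio</code> helper is a name chosen for this sketch, and the constants are copied
from the outputs above (with the decimal comma replaced by a point).</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust">/// Hypothetical helper: how many times `a` is larger than `b`.
fn ratio(a: f64, b: f64) -> f64 {
    a / b
}

fn main() {
    // task-clock:u values, in milliseconds
    let java_task_clock_ms = 11805.458302_f64;
    let rust_task_clock_ms = 2707.479968_f64;
    // "seconds time elapsed" values, converted to milliseconds
    let java_elapsed_ms = 133975.117164_f64;
    let rust_elapsed_ms = 133766.035253_f64;

    // CPU time used by Java relative to Rust (about 4.36)
    println!("CPU time ratio: {:.2}", ratio(java_task_clock_ms, rust_task_clock_ms));
    // Same computation as the "CPUs utilized" column of perf stat
    println!("Java CPUs utilized: {:.3}", ratio(java_task_clock_ms, java_elapsed_ms));
    println!("Rust CPUs utilized: {:.3}", ratio(rust_task_clock_ms, rust_elapsed_ms));
}</code></pre></figure>
<p>Both runs lasted about 134 seconds because the duration is dominated by the
download itself, so the task clock is the meaningful figure to compare.</p>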
<h2 id="conclusion">Conclusion</h2>
<p>Rewriting <em>Gnirehtet</em> in Rust was a great experience, during which I learned a
great language and new programming concepts. And now, we
have a native application with better performance.</p>
<p>Happy reverse tethering!</p>
<p><em>Discuss on <a href="https://www.reddit.com/r/rust/comments/71ks57/gnirehtet_a_reverse_tethering_tool_for_android/">reddit</a> and <a href="https://news.ycombinator.com/item?id=15326106">Hacker News</a>.</em></p>
Merging two git repositories2017-07-12T20:30:00+02:00https://blog.rom1v.com/2017/07/fusionner-deux-depots-git<p>This post explains how to merge a <em>git</em> repository (with its history) into
a subdirectory of another <em>git</em> repository.</p>
<h2 id="cas-dusage">Use case</h2>
<p>My main project lives in a <code class="language-plaintext highlighter-rouge">main</code> repository. In another repository, I started
an <code class="language-plaintext highlighter-rouge">other</code> project, which I now want to integrate into a
<code class="language-plaintext highlighter-rouge">sub/</code> subdirectory of the main project, keeping its history. After
this merge, I will keep only the main repository.</p>
<h2 id="fusion">Merge</h2>
<p>Both projects are in the current directory:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ls
main other
</code></pre></div></div>
<p>In the <code class="language-plaintext highlighter-rouge">main</code> repository, <em>copy</em> the <code class="language-plaintext highlighter-rouge">master</code> branch of <code class="language-plaintext highlighter-rouge">other</code> into a new
<code class="language-plaintext highlighter-rouge">tmp</code> branch:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nb">cd </span>main
git fetch ../other master:tmp</code></pre></figure>
<p>The <code class="language-plaintext highlighter-rouge">main</code> repository now contains the disjoint histories of both projects.</p>
<p>We will now rewrite the whole history of the <code class="language-plaintext highlighter-rouge">tmp</code> branch to
move all its content into a <code class="language-plaintext highlighter-rouge">sub/</code> subdirectory, using a command
given as an example in <a href="https://git-scm.com/docs/git-filter-branch#_examples"><code class="language-plaintext highlighter-rouge">man git filter-branch</code></a>:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">git checkout tmp
git filter-branch <span class="nt">--index-filter</span> <span class="se">\</span>
<span class="s1">'git ls-files -s | sed "s-\t\"*-&sub/-" |
GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
git update-index --index-info &&
mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"'</span></code></pre></figure>
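<p>To see what the <code class="language-plaintext highlighter-rouge">sed</code> expression does, recall that each line printed by
<code class="language-plaintext highlighter-rouge">git ls-files -s</code> has the form <em>mode</em>, <em>object</em>, <em>stage</em>, a tab, then the path
(wrapped in double quotes when it contains special characters). The
substitution inserts the prefix right after the tab and any quotes that follow
it. Here is a sketch of the same transformation in Rust; <code class="language-plaintext highlighter-rouge">prefix_path</code> is a
hypothetical function and the object id in the example is a placeholder.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust">/// Insert `prefix` right after the first tab (and any double quotes that
/// follow it), mimicking `sed "s-\t\"*-&PREFIX-"` on a single line.
fn prefix_path(line: &str, prefix: &str) -> String {
    match line.find('\t') {
        Some(tab) => {
            let rest = &line[tab + 1..];
            // Number of bytes taken by the leading double quotes, if any
            let quotes = rest.len() - rest.trim_start_matches('"').len();
            let split = tab + 1 + quotes;
            format!("{}{}{}", &line[..split], prefix, &line[split..])
        }
        // No tab: like sed, leave the line unchanged
        None => line.to_string(),
    }
}

fn main() {
    // Placeholder index entry (the object id is not a real one)
    let entry = "100644 0123456789abcdef0123456789abcdef01234567 0\tsrc/main.rs";
    println!("{}", prefix_path(entry, "sub/"));
}</code></pre></figure>
<p>In the real command, the rewritten lines are fed to
<code class="language-plaintext highlighter-rouge">git update-index --index-info</code> to build a new index for each commit.</p>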
<p>At this point, we still have two independent histories, but the content
of the <code class="language-plaintext highlighter-rouge">tmp</code> branch now lives in the <code class="language-plaintext highlighter-rouge">sub/</code> subdirectory.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>A---B---C---D master
X---Y---Z tmp
</code></pre></div></div>
<p>The last step is to connect the two histories, either with a
<em>rebase</em> or with a <em>merge</em>.</p>
<p>A <em>rebase</em> rewrites the history of the subproject on top of the <code class="language-plaintext highlighter-rouge">master</code> branch:</p>
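<figure class="highlight"><pre><code class="language-bash" data-lang="bash">git rebase master           <span class="c"># on tmp: replay X, Y, Z on top of D</span>
git checkout master
git merge <span class="nt">--ff-only</span> tmp     <span class="c"># fast-forward master to the rebased commits</span>
<span class="c"># A---B---C---D---X'--Y'--Z' master</span></code></pre></figure>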
<p>A <em>merge</em> simply connects the two histories with a <em>merge</em> commit:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">git checkout master
git merge tmp <span class="nt">--allow-unrelated-histories</span>
<span class="c"># A---B---C---D---M master</span>
<span class="c">#                /</span>
<span class="c">#       X---Y---Z tmp</span></code></pre></figure>
<h2 id="concrètement">In practice</h2>
<p>I started rewriting the relay server of <a href="/2017/03/gnirehtet/">gnirehtet</a> in <a href="https://www.rust-lang.org">Rust</a> in a
separate repository. Now that it is starting to work, I have merged it into a
<a href="https://github.com/Genymobile/gnirehtet/tree/rust/rustrelay">subdirectory</a> of the <a href="https://github.com/Genymobile/gnirehtet">main repository</a> while keeping the <a href="https://github.com/Genymobile/gnirehtet/commits/rust">history</a>:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">git fetch ../rustrelay master:tmp
git checkout tmp
git filter-branch <span class="nt">--index-filter</span> <span class="se">\</span>
<span class="s1">'git ls-files -s | sed "s-\t\"*-&rustrelay/-" |
GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
git update-index --index-info &&
mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"'</span>
git rebase master</code></pre></figure>
Introducing gnirehtet2017-03-30T12:00:00+02:00https://blog.rom1v.com/2017/03/introducing-gnirehtet<p>I spent the last few weeks at <a href="https://www.genymobile.com/">Genymobile</a> developing a tool providing <em>reverse
tethering</em> for Android, so that devices can use the internet connection of the
computer they are connected to via USB, without requiring any <em>root</em>
access (neither on the device nor on the computer). It works on <em>GNU/Linux</em>,
<em>Windows</em> and <em>Mac OS</em>.</p>
<p>We decided to open source it under the name <a href="https://github.com/Genymobile/gnirehtet"><em>gnirehtet</em></a>.</p>
<p><em>Yeah, that’s a weird name, until you realize that this is the output of this
<a href="https://en.wikipedia.org/wiki/Bourne-Again_shell">bash</a> command:</em></p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">rev <span class="o"><<<</span> tethering</code></pre></figure>
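<p>For readers without a shell at hand, the same reversal can be sketched in
Rust (the <code class="language-plaintext highlighter-rouge">reverse</code> function is just an illustrative helper):</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust">/// Reverse the characters of a string, like `rev` does for each input line.
fn reverse(s: &str) -> String {
    s.chars().rev().collect()
}

fn main() {
    println!("{}", reverse("tethering")); // prints "gnirehtet"
}</code></pre></figure>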
<h2 id="how-to-use-gnirehtet">How to use Gnirehtet</h2>
<p>Basically, just download the latest <a href="https://github.com/Genymobile/gnirehtet/releases/latest">release</a>, extract it, and execute the
following command on the computer:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./gnirehtet rt
</code></pre></div></div>
<p>Once activated, a “key” logo appears in your device status bar:</p>
<p class="center"><img src="/assets/gnirehtet/key.png" alt="key" /></p>
<p>Check the <a href="https://github.com/Genymobile/gnirehtet/blob/master/README.md">README</a> file of the project for more details.</p>
<h2 id="how-does-gnirehtet-work">How does gnirehtet work?</h2>
<p>Gnirehtet is composed of two parts:</p>
<ul>
<li>an Android application (the client);</li>
<li>a Java desktop application (the relay server).</li>
</ul>
<p><em>Since then, <a href="/2017/09/gnirehtet-rewritten-in-rust/">I rewrote it in Rust</a>.</em></p>
<p>The client registers itself as a VPN, in order to intercept the whole device
network traffic, as <code class="language-plaintext highlighter-rouge">byte[]</code> of raw <a href="https://en.wikipedia.org/wiki/IPv4#Packet_structure">IPv4 packets</a>, which it transmits to the
relay server over a <a href="https://en.wikipedia.org/wiki/Transmission_Control_Protocol">TCP</a> connection (established over <a href="https://developer.android.com/studio/command-line/adb.html"><em>adb</em></a>).</p>
<p>The relay server parses the packet headers, opens connections from the computer
to the requested destinations, and relays the content in both directions
following the <a href="https://en.wikipedia.org/wiki/User_Datagram_Protocol">UDP</a> and <a href="https://en.wikipedia.org/wiki/Transmission_Control_Protocol">TCP</a> protocols. It creates and sends response packets
back to the Android client, which writes them to the VPN interface.</p>
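<p>To make the header-parsing step more concrete, here is a minimal sketch of
reading the fixed part of an IPv4 header from a raw packet. It is not the
actual gnirehtet code: the <code class="language-plaintext highlighter-rouge">Ipv4Header</code> struct and the <code class="language-plaintext highlighter-rouge">parse_ipv4_header</code>
function are hypothetical names, the example is written in Rust for
illustration, and it ignores options and checksum validation.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust">use std::net::Ipv4Addr;

/// Fields of interest from an IPv4 header (illustrative names).
#[derive(Debug, PartialEq)]
struct Ipv4Header {
    header_length: usize, // in bytes
    total_length: u16,    // header + payload, in bytes
    protocol: u8,         // 6 = TCP, 17 = UDP
    source: Ipv4Addr,
    destination: Ipv4Addr,
}

/// Parse the fixed part of an IPv4 header from a raw packet.
fn parse_ipv4_header(raw: &[u8]) -> Option<Ipv4Header> {
    // The minimal header is 20 bytes; the version is in the high nibble
    if raw.len() < 20 || raw[0] >> 4 != 4 {
        return None;
    }
    // The IHL field (low nibble) is expressed in 32-bit words
    let header_length = usize::from(raw[0] & 0x0f) * 4;
    if header_length < 20 || raw.len() < header_length {
        return None;
    }
    Some(Ipv4Header {
        header_length,
        total_length: u16::from_be_bytes([raw[2], raw[3]]),
        protocol: raw[9],
        source: Ipv4Addr::new(raw[12], raw[13], raw[14], raw[15]),
        destination: Ipv4Addr::new(raw[16], raw[17], raw[18], raw[19]),
    })
}

fn main() {
    // Hand-built header: UDP packet from 10.0.0.2 to 8.8.8.8
    // (checksum left at zero, it is not validated here)
    let raw: [u8; 20] = [
        0x45, 0x00, 0x00, 0x1c, 0x00, 0x00, 0x00, 0x00,
        0x40, 0x11, 0x00, 0x00, 10, 0, 0, 2, 8, 8, 8, 8,
    ];
    println!("{:?}", parse_ipv4_header(&raw));
}</code></pre></figure>
<p>From the protocol and destination fields, a relay can decide whether to open
a TCP or a UDP socket, and towards which address.</p>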
<p>In a sense, the relay server behaves like a <a href="https://en.wikipedia.org/wiki/Network_address_translation">NAT</a>, in that it opens connections
on behalf of private peers. However, it differs from standard NATs in the way it
communicates with the clients (the private peers), by using a very specific
(though simple) protocol over a TCP connection.</p>
<p class="center"><img src="/assets/gnirehtet/archi.png" alt="archi" /></p>
<p>For more details, you can read the <a href="https://github.com/Genymobile/gnirehtet/blob/master/DEVELOP.md">developers</a> page.</p>
<h2 id="here-are-the-solutions-i-have-considered">Here are the solutions I have considered</h2>
<p>Once the application is able to intercept the whole device network traffic,
several alternative designs are possible.</p>
<p><em><strong>TL;DR:</strong> I first considered creating a “TUN device” on the computer, but it
did not suit our needs. Then I wanted to benefit from existing <a href="https://en.wikipedia.org/wiki/SOCKS">SOCKS</a> servers,
but some constraints prevented us from relaying UDP traffic. So I implemented
<a href="https://github.com/Genymobile/gnirehtet">gnirehtet</a>.</em></p>
<h3 id="tun-device">TUN device</h3>
<p>During my investigations on how to implement <em>reverse tethering</em>, I first found
projects creating a <a href="https://en.wikipedia.org/wiki/TUN/TAP">TUN device</a> on the computer (<a href="https://github.com/google/vpn-reverse-tether"><code class="language-plaintext highlighter-rouge">vpn-reverse-tether</code></a> and
<a href="https://github.com/vvviperrr/SimpleRT"><code class="language-plaintext highlighter-rouge">SimpleRT</code></a>).</p>
<p>This design works very well, and has several advantages:</p>
<ul>
<li>it operates at network level, so there is no need for translation between
level 3 and level 5 of the <a href="https://en.wikipedia.org/wiki/OSI_model">OSI model</a>;</li>
<li>all IP packets are tunneled, regardless of their transport protocol (so they
are <a href="https://en.wikipedia.org/wiki/List_of_IP_protocol_numbers">all</a> supported, while <em>gnirehtet</em> “only” supports <a href="https://en.wikipedia.org/wiki/Transmission_Control_Protocol">TCP</a> and
<a href="https://en.wikipedia.org/wiki/User_Datagram_Protocol">UDP</a>).</li>
</ul>
<p>However:</p>
<ul>
<li>it requires <em>root</em> access on the computer;</li>
<li>it does not work on platforms other than <em>Linux</em>.</li>
</ul>
<p><em>You could still consider using these “TUN device” applications; they may
suit your needs better.</em></p>
<h3 id="socks">SOCKS</h3>
<p>To avoid developing a specific relay server, my first idea was to make
the client speak the <a href="https://en.wikipedia.org/wiki/SOCKS">SOCKS</a> protocol (according to <a href="https://tools.ietf.org/html/rfc1928">RFC 1928</a>). That way, it
would be possible to use any existing SOCKS server, for instance the one
provided by <code class="language-plaintext highlighter-rouge">ssh -D</code>.</p>
<p>You have probably already used it to bypass annoying enterprise firewalls. For this
purpose, just start the tunnel:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ssh my_server -ND1080
</code></pre></div></div>
<p>Then configure your browser to use the SOCKS proxy <code class="language-plaintext highlighter-rouge">localhost:1080</code>. Also take
care to enable remote DNS resolution if you want to resolve domain names from
<code class="language-plaintext highlighter-rouge">my_server</code> (in <em>Firefox</em>, enable <code class="language-plaintext highlighter-rouge">network.proxy.socks_remote_dns</code> in
<code class="language-plaintext highlighter-rouge">about:config</code>).</p>
<p>Unfortunately, the <a href="https://en.wikipedia.org/wiki/OpenSSH">OpenSSH</a> implementation <a href="http://lists.mindrot.org/pipermail/openssh-unix-dev/2017-January/035662.html">does not support UDP</a>,
although the <a href="https://en.wikipedia.org/wiki/SOCKS#SOCKS5">SOCKS5</a> protocol itself does. And we do need UDP, at least for
<a href="https://en.wikipedia.org/wiki/Domain_Name_System">DNS</a> requests (and also <a href="https://en.wikipedia.org/wiki/Network_Time_Protocol">NTP</a>).</p>
<p>If you read the last two paragraphs carefully, you might ask yourself:</p>
<blockquote>
<p>How may Firefox resolve domain names remotely through the OpenSSH SOCKS proxy
if it does not even support UDP?</p>
</blockquote>
<p>The answer lies in <a href="https://tools.ietf.org/html/rfc1928#section-4">section 4</a> of the RFC: the requested destination address
may be an IPv4, an IPv6 or <strong>a domain name</strong>. However, using this feature
implies that the client (e.g. <em>Firefox</em>) is aware of the proxy (since it must
explicitly pass the domain name instead of resolving it locally), while our
<em>reverse tethering</em> must be <strong>transparent</strong>.</p>
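<p>To illustrate the domain-name address type, here is a sketch of how a
proxy-aware client could build a SOCKS5 CONNECT request following the layout
of RFC 1928, section 4. The function name <code class="language-plaintext highlighter-rouge">socks5_connect_domain</code> is
hypothetical, and the sketch leaves out the initial method negotiation and the
server reply.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust">/// Build a SOCKS5 CONNECT request whose destination is a domain name.
/// Returns None if the name is empty or does not fit in one length byte.
fn socks5_connect_domain(host: &str, port: u16) -> Option<Vec<u8>> {
    if host.is_empty() || host.len() > 255 {
        return None;
    }
    let mut request = vec![
        0x05,             // VER: SOCKS version 5
        0x01,             // CMD: CONNECT
        0x00,             // RSV: reserved
        0x03,             // ATYP: domain name (0x01 = IPv4, 0x04 = IPv6)
        host.len() as u8, // first octet of DST.ADDR: length of the name
    ];
    request.extend_from_slice(host.as_bytes());
    // DST.PORT, in network byte order
    request.extend_from_slice(&port.to_be_bytes());
    Some(request)
}

fn main() {
    let request = socks5_connect_domain("example.com", 443).unwrap();
    println!("{:02x?}", request);
}</code></pre></figure>
<p>Since the name is sent as-is, the resolution happens on the proxy side, which
is why the client must know it is talking to a proxy.</p>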
<p>But all is not lost. OK, <em>OpenSSH</em> does not support UDP, but this is just one
specific implementation; we could consider another one. Unfortunately, <a href="http://stackoverflow.com/questions/41967217/why-does-socks5-require-to-relay-udp-over-udp">SOCKS5
requires relaying UDP over UDP</a>, while the devices and the computer
communicate over <em>adb</em> (thanks to <code class="language-plaintext highlighter-rouge">adb reverse</code>), which does not support UDP
port forwarding either.</p>
<p>Maybe we could at least relay DNS requests by forcing them to <a href="https://tools.ietf.org/html/rfc7766">use TCP</a>, like <a href="https://linux.die.net/man/8/tsocks">tsocks</a> does:</p>
<blockquote>
<p><strong>tsocks</strong> will normally not be able to send DNS queries through a SOCKS
server since SOCKS V4 works on TCP and DNS normally uses UDP. Version 1.5 and
up do however provide a method to force DNS lookups to use TCP, which then
makes them proxyable.</p>
</blockquote>
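<p>For context, sending a DNS query over TCP mostly changes the framing: the
message is preceded by a two-byte length field in network byte order
(RFC 1035, section 4.2.2). A minimal sketch of that framing, with a
hypothetical <code class="language-plaintext highlighter-rouge">frame_dns_for_tcp</code> function and a placeholder message:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust">/// Frame a DNS message for transport over TCP: prefix it with its length
/// as a two-byte big-endian integer.
fn frame_dns_for_tcp(message: &[u8]) -> Option<Vec<u8>> {
    // The length must fit in 16 bits
    if message.len() > 0xffff {
        return None;
    }
    let len = message.len() as u16;
    let mut framed = Vec::with_capacity(2 + message.len());
    framed.extend_from_slice(&len.to_be_bytes());
    framed.extend_from_slice(message);
    Some(framed)
}

fn main() {
    // Placeholder bytes standing in for a real DNS query
    let query = [0u8; 12];
    let framed = frame_dns_for_tcp(&query).unwrap();
    println!("{:02x?}", framed);
}</code></pre></figure>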
<p>But then, SOCKS was no longer attractive to me for implementing <em>reverse
tethering</em>.</p>
<h3 id="gnirehtet">Gnirehtet</h3>
<p>Therefore, I developed both the client and the relay server manually.</p>
<p>This <a href="http://www.thegeekstuff.com/2014/06/android-vpn-service/">blog post</a> and several open source projects (<a href="https://github.com/vvviperrr/SimpleRT"><code class="language-plaintext highlighter-rouge">SimpleRT</code></a>,
<a href="https://github.com/google/vpn-reverse-tether"><code class="language-plaintext highlighter-rouge">vpn-reverse-tether</code></a>, <a href="https://github.com/hexene/LocalVPN"><code class="language-plaintext highlighter-rouge">LocalVPN</code></a> and <a href="https://android.googlesource.com/platform/development/+/master/samples/ToyVpn/"><code class="language-plaintext highlighter-rouge">ToyVpn</code></a>) helped me a lot to
understand how to implement this solution.</p>
<h2 id="conclusion">Conclusion</h2>
<p><a href="https://github.com/Genymobile/gnirehtet"><em>Gnirehtet</em></a> allows Android devices to use the internet connection
from a computer easily, without any <em>root</em> access. It helps when you can’t
access the network using a WiFi access point.</p>
<p>I hope it will be useful to some of you.</p>
<p><em>This post was initially published on <a href="https://medium.com/genymobile/gnirehtet-reverse-tethering-android-2afacdbdaec7">medium</a>.</em></p>
<p><em>Discuss on <a href="https://www.reddit.com/r/Android/comments/62lc8z/a_reverse_tethering_tool_for_android_no_root/">reddit</a> and <a href="https://news.ycombinator.com/item?id=14011590">Hacker News</a>.</em></p>