Running Mac OS X under qemu/KVM on AMD CPU

Thanks to the excellent work of Gabriel L. Somlo it is possible to run Mac OS X as a guest on Linux under QEMU/KVM. The required changes seem to be minimal, and the emulated operating system works well, as long as you have an Intel CPU, that is.

If you only have an AMD CPU, the emulation works only without KVM, i.e. when you run qemu without the -enable-kvm option. With this option the emulation hangs on the grey screen with the Apple logo. Enabling verbose boot (the -v option to Chameleon) shows an empty black screen instead.

This happens because…


Practical difference between epoll and Windows IO Completion Ports (IOCP)

Introduction

This article compares epoll with Windows I/O Completion Ports (hereafter IOCP).

It may be of interest to system architects who need to create high-performance cross-platform networking servers, and to software engineers porting such code from Windows to Linux or vice versa. It may also be of interest to developers familiar with one technology who would like to learn more about the other.

Both epoll and IOCP are efficient technologies when you need to support high performance networking with a large number of connections. They differ from other polling methods in many ways, such as:

  • They have no practical limitations (besides system resources) on the total number of descriptors/operations to monitor;
  • They scale well with a VERY large number of descriptors; each one adds very little overhead compared to other polling/notification methods;
  • They are well suited to a thread pool based model, where a small thread pool handles a large number of connections through a state machine;
  • They are ineffective, burdensome and essentially useless if all you need is a couple of client connections; their purpose is to handle 1000+ simultaneous connections.

In short, both technologies are suitable for creating a networked server which has to serve a very large number of clients concurrently. However, they also differ significantly, and it is important to understand how.

Notification type

The first and most important difference between epoll and IOCP is when you can receive a notification.

  • epoll tells you when a file descriptor is ready to perform a requested operation.
  • IOCP tells you when a requested operation is completed (or failed to complete).

When using epoll an application:

  • Decides what action is required to be performed on a specific descriptor (receive data, send data or both);
  • Sets the polling mask for the descriptor via epoll_ctl
  • Calls epoll_wait, which blocks until at least one monitored event is triggered. If more than one event is triggered, the function returns as many as fit into the supplied event array.
  • Finds the event data pointer in the data union of each returned event.
  • If the specific bits are set in the events field of the returned epoll_event structure, initiates the specific operation (such as read, write or both).
  • After the operation completes (which should be immediate), proceeds with processing the data read, or sends more data if there is any;
  • Notably, the descriptor may have both events set at the same time, so the application can perform both read and write.
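The steps above can be sketched with Python's select.epoll, which is a thin wrapper over the same epoll_create/epoll_ctl/epoll_wait calls. This is only an illustration under a few assumptions: the serve_one_round and demo names and the connections dictionary are made up for the example, and Python hides the data union by returning (fd, events) pairs instead.

```python
import select
import socket

def serve_one_round(ep, connections, timeout=1.0):
    """Run one epoll_wait round and act on whatever is ready.

    `connections` is a hypothetical dict mapping fd -> socket; a real server
    would keep a richer per-connection object there.
    Returns a dict mapping fd -> bytes read during this round.
    """
    received = {}
    # epoll_wait: blocks until at least one monitored event fires (or timeout)
    for fd, events in ep.poll(timeout):
        sock = connections[fd]
        if events & select.EPOLLIN:
            # Readiness notification: the read itself happens now and returns at once
            received[fd] = sock.recv(4096)
        if events & select.EPOLLOUT:
            # The socket can accept more output; a real server would flush its queue here
            pass
    return received

def demo():
    a, b = socket.socketpair()
    a.setblocking(False)
    b.setblocking(False)
    ep = select.epoll()
    try:
        # epoll_ctl(EPOLL_CTL_ADD): ask for both read and write readiness on `a`
        ep.register(a.fileno(), select.EPOLLIN | select.EPOLLOUT)
        b.send(b"hello")
        return serve_one_round(ep, {a.fileno(): a})
    finally:
        ep.close()
        a.close()
        b.close()
```

Note that both bits may be reported in the same round, which is why the two checks are independent rather than an if/else.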

With IOCP an application:

  • Initiates the required action on a specific descriptor (ReadFile or WriteFile) with a non-NULL OVERLAPPED argument. The system queues the operation and the function returns immediately. (As a side note, the operation may complete immediately, but this doesn’t change the logic, since even an operation which completed immediately still posts a completion notification; on Vista and later this can be turned off with SetFileCompletionNotificationModes.)
  • Calls GetQueuedCompletionStatus(), which blocks until exactly one operation has completed and posted its notification. It makes no difference if more than one operation has completed; this function will only pick one.
  • Finds the event data pointer from the completion key and the OVERLAPPED pointer.
  • Proceeds with processing the data read, or sends more data if there is any;
  • Only one completed operation can be retrieved from the queue per call (Vista and later also provide GetQueuedCompletionStatusEx, which can retrieve several at once).

The difference between the notification types makes it possible, and fairly easy, to emulate IOCP with epoll by using a separate thread pool. The Wine project does just that. The reverse is not that easy: I don’t know of any easy way to emulate epoll with IOCP, and it looks nearly impossible to do so while keeping the same or similar performance.

Data accessibility

A networked server typically operates with connection objects which contain the socket descriptor as well as other linked data such as buffers. Typically those objects are destroyed when the relevant socket is closed. There are, however, certain limitations when using IOCP.

IOCP relies on queuing ReadFile and WriteFile operations, which may complete some time in the future. Both read and write operate on buffers, and require the buffers passed to them to be left intact until the operation completes. Moreover, you are not allowed to touch the data in those buffers. This imposes several important restrictions:

  • You cannot use a local (stack-allocated) buffer to read the data into, or to send the data from, because the buffer must remain valid until the operation completes, which typically happens after you leave the function and thus invalidate the buffer pointers;
  • You cannot dynamically reallocate the output buffer, for example when there is more data to send and you decide to increase the buffer size. You cannot do this while there is a pending operation on this buffer, because this would invalidate the buffer. You can create a new buffer, but you cannot destroy the old one, and since you don’t know how much data will be sent, this makes the code more complex.
  • If you write a proxy application, you will most likely have to introduce double buffering, because both sockets would always have an active operation pending, and you could not touch their buffers.
  • If your connection manager class is designed so that it can destroy a connection object at any time (for example when the connection reported an error while processing the received data), the object still cannot be destroyed until all its pending IOCP operations complete.

An IOCP operation also requires a pointer to an OVERLAPPED structure, which likewise has to be kept intact, and not reused, until the operation completes. This also means that if you need to read and write at the same time, you cannot inherit your class from the OVERLAPPED structure, which is a common design pattern. You would have to keep two structures as members of your socket class instead, passing one to ReadFile and the other to WriteFile.

epoll, however, does not use any I/O buffers, and therefore none of those issues apply.

Waiting condition modification

Both epoll and IOCP make it easy to add a new event condition. Epoll allows the polling mask to be modified at any time, and IOCP allows a new operation to be started at any time.

Modifying or removing an existing condition, however, is a different matter. Epoll allows a condition to be modified easily with a single epoll_ctl call. It can be performed from any thread, and works safely even if other threads are waiting for the condition.
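As a small illustration, here is how changing the polling mask looks through Python's select.epoll, whose modify() method corresponds to epoll_ctl with EPOLL_CTL_MOD. The demo_modify name and the socketpair setup are assumptions made only for this sketch.

```python
import select
import socket

def demo_modify():
    """Switch a descriptor from read interest to write interest in one call."""
    a, b = socket.socketpair()
    ep = select.epoll()
    try:
        # Initially we only care about incoming data
        ep.register(a.fileno(), select.EPOLLIN)
        before = ep.poll(0)      # nothing has been sent, so nothing is ready

        # One epoll_ctl(EPOLL_CTL_MOD) call replaces the mask; no cancel step needed
        ep.modify(a.fileno(), select.EPOLLOUT)
        after = ep.poll(0)       # the socket is writable, so it is reported now
        return before, after
    finally:
        ep.close()
        a.close()
        b.close()
```

Nothing is pending inside the kernel on behalf of the application, so there is nothing to cancel and no buffer to keep alive while the mask changes.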

IOCP is much more burdensome. If an I/O operation is scheduled, it has to be canceled first by calling the CancelIo function. This function can only be called by the thread which scheduled the operation (i.e. it cannot be called by a dedicated management thread; CancelIoEx, available since Vista, lifts this restriction), and the outcome is unknown until GetOverlappedResult retrieves the operation status. As stated above, this also means the buffers are untouchable until that happens.

Another issue with IOCP is that once an operation is scheduled it cannot be changed. For example, you cannot modify a queued ReadFile and tell it that you’d now like to read only 10 bytes and not 8192; you’d have to cancel and reissue it. This is not an issue with epoll, which does not schedule operations.

Non-blocking connect

Some server implementations (inter-linked servers, FTP, p2p) need to initiate outbound connections. Both epoll and IOCP support non-blocking connect, although in different ways.

With epoll the code is the same whether you use select, poll or epoll. You create a non-blocking socket, call connect() on it and monitor it for the writing (EPOLLOUT) condition.
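A minimal sketch of that sequence using Python's select.epoll follows. The nonblocking_connect helper is hypothetical, and the SO_ERROR check after the wait is the usual way to learn whether the connect succeeded; it is an addition to the steps described above rather than something specific to epoll.

```python
import errno
import select
import socket

def nonblocking_connect(address, timeout=5.0):
    """Start a non-blocking connect and wait for EPOLLOUT to learn the outcome."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)

    # connect_ex returns an errno value instead of raising;
    # EINPROGRESS is the normal answer for a non-blocking socket
    err = sock.connect_ex(address)
    if err not in (0, errno.EINPROGRESS):
        sock.close()
        raise OSError(err, "connect failed")

    ep = select.epoll()
    try:
        # The socket becomes writable once the connection attempt has finished
        ep.register(sock.fileno(), select.EPOLLOUT)
        if not ep.poll(timeout):
            sock.close()
            raise TimeoutError("connect timed out")
    finally:
        ep.close()

    # Writable does not mean connected: SO_ERROR holds the real result
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        sock.close()
        raise OSError(err, "connect failed")
    return sock
```

In a real server the socket would be registered with the epoll instance already used for the other connections instead of a temporary one, so the connect completes inside the normal event loop.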

With IOCP you need to use a special function, ConnectEx, because the connect() call does not accept an OVERLAPPED structure and therefore cannot be queued to the IOCP. So not only is the code different from epoll, it is also different from a Windows implementation which uses select or poll. It is fair to say, though, that the change required is rather small and insignificant.

An interesting note: accept() works as usual with IOCP. There is an AcceptEx function, but its role is completely different. It is not a non-blocking accept().

Event monitoring

Often more data has arrived by the time the original event condition is handled. For example, epoll monitoring a descriptor for reading, or IOCP waiting for a ReadFile to complete, is triggered by the socket receiving the first chunk of data. Now what happens if a few more chunks of data arrive after that? Is it possible to safely retrieve them without polling again?

With epoll it is possible. Even if you receive only one read event, you can loop calling read() on the socket until you either read less than the requested amount or get the EAGAIN error (and if you use the epoll edge-triggered mode you must do exactly that). The same applies to sending: if your producer generates the data in small chunks, you can loop around the write() call until EAGAIN.
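The read loop can be sketched as follows with Python's select.epoll in edge-triggered mode (EPOLLET). The drain and demo_edge_triggered names are made up for the illustration, the tiny chunk size is chosen only to force several reads, and Python reports EAGAIN as BlockingIOError.

```python
import select
import socket

def drain(sock, chunk_size=4):
    """Read until the kernel buffer is empty (EAGAIN) or the peer has closed."""
    data = bytearray()
    while True:
        try:
            chunk = sock.recv(chunk_size)
        except BlockingIOError:
            # EAGAIN / EWOULDBLOCK: nothing left to read right now
            break
        if not chunk:
            # An empty read means the peer closed the connection
            break
        data += chunk
    return bytes(data)

def demo_edge_triggered():
    a, b = socket.socketpair()
    a.setblocking(False)
    ep = select.epoll()
    try:
        ep.register(a.fileno(), select.EPOLLIN | select.EPOLLET)
        # Several chunks arrive before the application gets around to reading
        b.send(b"first ")
        b.send(b"second ")
        b.send(b"third")
        events = ep.poll(1.0)
        payload = drain(a) if events else b""
        # Edge mode: no further event is reported until new data arrives
        again = ep.poll(0)
        return len(events), payload, again
    finally:
        ep.close()
        a.close()
        b.close()
```

A single notification is enough to pick up all three chunks, and no second trip through epoll_wait is needed in between.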

With IOCP it is not possible. To read or send more data on the socket you need to post another ReadFile or WriteFile operation, and wait until it completes. This may add another level of complexity. Consider the following example:

  1. A socket class posted a ReadFile operation, and threads A and B are waiting in GetQueuedCompletionStatus().
  2. The operation completed, thread A got the operation result, and called the socket class to process the data read.
  3. The socket class decided to read more data, and posted another ReadFile operation.
  4. This operation completed immediately, thread B got the result and called the socket class to process the data read.
  5. Now the read processing function is being called by two threads at the same time, with the execution order unknown.

There are of course a few ways to avoid this. The first would be to have a lock per class instance, but this introduces another issue. Locks aren’t unlimited, and if you need to support 100k concurrent connections, you may run out of locks. You would also lose some concurrency, because your execution path for processing the data read may have nothing in common with the execution path for processing the data written.

The usual solution is to have the connection manager class call the ReadFile or WriteFile for the class. This is better – and as a bonus, allows destroying the class if needed – but makes the code more complex.

Conclusion

Both epoll and IOCP are suitable for, and typically used for, writing high performance networking servers handling a large number of concurrent connections. However, the two technologies differ significantly enough to require different event processing code. This difference will most likely make a common implementation of the connection/socket class pointless, as the amount of code that could actually be shared would be minimal. In several implementations I have worked on, an attempt to unify the code resulted in much less maintainable code compared to separate implementations, and was always rejected.

Also when porting, it is usually easier to port the IOCP-based code to use epoll than vice versa.

So my suggestion:

  • If you need to develop a cross-platform networking server, you should focus on Windows and start with IOCP support. Once it is done, it will be easy to add an epoll-based backend.
  • It is usually futile to implement single Connection and ConnectionMgr classes. You will end up not only with a whole lot of #ifdef’s but also with different logic. It is better to create a base ConnectionMgr class and inherit from it. This way you can keep any shared code in the base class, if there is any.
  • Watch out for the scope of your Connection, and make sure you do not delete an object which has read and/or write operations pending.

select / poll / epoll: practical difference for system architects

When designing a high performance networking application with non-blocking socket I/O, the architect needs to decide which polling method to use to monitor the events generated by those sockets. There are several such methods, and the use cases for each of them are different. Choosing the correct method may be critical to satisfy the application needs.

This article highlights the differences among the polling methods and suggests which one to use.


Polling with select()


Failed to load steamui.so ?

A recent Steam update switched it to SDL2, so unless you have the very latest libSDL2 installed, you’ll get an error while trying to start Steam:

Fatal error: Failed to load steamui.so

A quick test will reveal the problem:

LD_LIBRARY_PATH=$HOME/Steam/ubuntu12_32 ld $HOME/Steam/ubuntu12_32/steamui.so

which will print a bunch of lines starting with:

ld: warning: libSDL2-2.0.so.0, needed by /home/tim/Steam/ubuntu12_32/steamui.so, not found (try using -rpath or -rpath-link)

This means you need to install libSDL2. In the case of openSuSE, even the latest version (12.3 as of now) does not come with libSDL2. Fortunately it is quite easy to build it.

Make sure you have Mercurial and the development packages installed, and compile and install SDL2 by doing the following:

hg clone http://hg.libsdl.org/SDL SDL2
cd SDL2
mkdir build
cd build
../configure
make
cp build/.libs/libSDL2-2.0.so.0.0.0 $HOME/Steam/ubuntu12_32/libSDL2-2.0.so.0

Now your Steam will work again.


Installing Steam on an unsupported (non-Ubuntu) Linux

Today, on Feb 14th, Valve released Steam for Linux. At the moment it officially supports only Ubuntu. However, it is easy to run it on any other Linux distribution, in my case on OpenSuSE 12.2.

Preventing WordPress comment spam

There seems to be an easy way to prevent a significant amount of WordPress comment spam.

The majority of spam comments nowadays come with either a bunch of URLs, or with a generic message such as:

Hi there! Just discovered your site while i was browsing and i must say that i found it quite interesting! I hope you don’t mind if i return here from time to time and check your content.

Those messages usually do not have any URLs. The spammer attempts to achieve their goal by setting up the “Website” comment field, pointing it to their spam site.

The obvious solution seems to be simply removing this field from the comment form. There are two ways to do that, and neither of them reduces the spam:

  • Remove the Website field from the comment form. This doesn’t change anything, since most spammers use software which doesn’t even look at the comment form and just sets the fields which “should be there”. And since the WordPress code still handles the “url” field, the spam comment gets through the same way as before.
  • Remove the url field altogether, in the hope that spammers will see that their comments appear without a website, and are therefore useless for their purpose, and will leave you alone. Again, this is not how spammers work: they do not track posted comments (most of which get removed in seconds anyway), so it does not reduce spam. If you’re using Akismet it also comes with a major disadvantage: the website field is a major source for spam detection, so comments with the same content but without this field set are no longer detected as spam.

So the idea is to turn the spammer logic against them.

First we disable, but do not hide, the Website comment field by adding the disabled attribute to its input element. This can be done by changing wp-includes/comment-template.php in the following way:

                'url'    => '<p class="comment-form-url"><label for="url">' . __( 'Website' ) . '</label>' .
                            '<input id="url" name="url" type="text" disabled value="' . esc_attr( $commenter['comment_author_url'] ) . '" size="30" /></p>',

The disabled attribute is added between the type="text" and value attributes.

Second, we refuse any comment which still contains the Website field. Regular users cannot enter a website anyway (the field is disabled), but the spam bots ignore this restriction, so the only entities able to submit a non-empty Website field are the spam bots. So we check whether a new comment comes with a non-empty website field, and block it if so. This can be achieved by hooking into the WordPress system to intercept a new comment being posted.

To do so, add the following code to wp-content/themes/<your theme name>/functions.php:

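// Reject any comment submitted with a non-empty Website (url) field.
// The pre_comment_on_post hook passes the post ID, which is not needed here.
function must_have_no_url_field( $comment_post_id )
{
        if ( !empty( $_POST['url'] ) )
        {
              wp_die( "Spammers not welcome here" );
        }
}

add_action( 'pre_comment_on_post', 'must_have_no_url_field' );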

This function is called each time a new comment is posted, and prevents comments with a non-empty Website field from appearing. At the same time, the value of this field is still passed intact when comments are submitted to Akismet, which keeps the spam detection rate high while stopping the comments which would otherwise slip through.


Help to fight Internet censorship!

As of Nov 1st 2012, a new Russian law implementing Internet censorship is in effect. This law allows several agencies of the Russian government to add any Internet site they consider “harmful to children” to a government-mandated block list. Russian Internet providers are legally obligated to block access to the sites on this list. A quick summary of the law:

  • Sites can be blocked either by a court or by one of the government agencies. The law currently allows blocking three categories of web sites: child pornography, drug propaganda and web sites about suicide.
  • The law provides no oversight and no penalties for the government employees who add a specific site to the block list. Further, the list itself is secret and only available to the Internet providers.
  • The law requires the government to notify the site owner and let the owner remove the content within three days. This, however, does not happen, and sites get blocked without any advance warning.
  • It takes a court order to remove a site from the block list, while it can be added there simply by some government clerk.
  • Access is blocked for everyone, even adults who don’t have any children. The block is mandatory.

The Internet is the main vehicle fueling democracy in Russia. Popular social networking sites such as LiveJournal and Facebook/VKontakte are widely used by the opposition to coordinate peaceful protests, uncover major corruption scandals and simply exercise their free speech rights by sharing opinions which are censored on the TV channels maintained by the Russian government. Therefore a lot of Russians are worried that the real purpose of the new law is to quickly shut down the resources the opposition uses to fight the Putin regime. During the first days after the law went into effect, a few political satire sites and a site about suicide prevention were blocked. This is just the start.

It is a worrying trend when governments limit Internet access for adults under the guise of “saving our children”. However, we can help Russians fight Internet censorship.

What can you do:

  1. Spread the word! Tell others about the censorship and how to work around it.
  2. Set up the Tor or I2P software and run an exit node or an intermediate node. This will help people reach censored sites. We are running a Tor node here at Ulduzsoft.
  3. Donate to NoiseBridge or similar organizations which run Tor exit nodes for everyone to use.
  4. Educate your friends about the effects of censorship on society. Censorship is a very attractive option for any government; we must be vigilant to preserve our right to free speech!

Reverse-engineering the KaraFun file format. Part 4, the encryption

So far the files we have seen had no encryption. However, some of our users pointed out that some files are encrypted. While the encrypted KFN files were still analyzed and dumped properly, the resulting files were unreadable. Of course the player needs to support those files too, so this is something we need to take care of.

First, let me state that reverse-engineering encryption is typically a very difficult task, especially when the encryption keys are unknown and obtaining them requires reverse-engineering the software itself. However, as you will see below, due to a few major flaws in the KaraFun software it is still possible to reverse-engineer even the encrypted files without dealing with the software itself.

Reverse-engineering the KaraFun file format. Part 3, the Song.ini file

This is quite simple. We look at the song.ini file and it is immediately obvious where the text and the timing information are, as those are the only lines with enough numbers.

Reverse-engineering the KaraFun file format. Part 2, the directory

In the first part we worked out the header format, and found that it does not give us the directory location. However, we know there must be a directory, as the KaraFun application must know exactly where in the file the embedded files are stored, and how large they are. At a minimum there should be the directory offset and either the total size or the number of files. At first thought the DIFW header value might contain the number of files, and the MUSL value the directory offset (its value is 0x11D, which is after 0x117). However, if we check other KaraFun files from the same page, we see that for some files the MUSL value is less than the header length. Therefore it cannot be the offset, and is probably the music length in seconds. Nor is DIFW the number of files. A quick search for the JPEG signature “JFIF” finds at least three JPG files, so there are more than two files in this archive.

So where is the directory? Since the header length varies (because it uses variable-length strings), it could be in one of two places. Either it is at the end of the file (not the case, as we saw above), or it is supposed to follow the header directly. Let’s look carefully at the bytes following the header: