June 14, 2014

Difference between Async and Task

What are the differences between Async and Task? I think this is worth analyzing as part of figuring out how Erlang compares to F#.

Josh from work told me about an article by Oren Eini, a longtime NHibernate contributor and the creator of RavenDB. The gist of it was that Task, which is the preferred way of running work on background threads, was not great to use with HttpClient. Instead, the article swapped it for async, and it sped up like a pro! Why? Threading makes everything awesome, right? Sorta. This gets back to IO-bound vs CPU-bound work.

IO-bound processing is work that spends its time waiting on the network, disk, or some other call that goes in and out of the process. CPU-bound processing is something like crunching the Fibonacci sequence. With IO-bound processing you have to wait, because no matter how many CPUs you have, they will not speed up the network or disk. CPU-bound processing needs, well, the CPU.
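Here is a rough sketch of the two kinds of work, in Python rather than .NET. The names `fake_io` and `fib` are made up for illustration, and `time.sleep` is only an assumed stand-in for a real network or disk call. The point it tries to show is that IO waits can overlap, because the CPU is idle while they happen.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(seconds):
    # Stand-in for a network or disk wait: the thread sleeps, the CPU is idle.
    time.sleep(seconds)
    return seconds

def fib(n):
    # CPU-bound work: every step of the loop needs the processor.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# Four simulated IO waits running at the same time.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    waits = list(pool.map(fake_io, [0.2] * 4))
overlapped = time.perf_counter() - start

serial_estimate = sum(waits)  # what the waits would cost back to back
fib_30 = fib(30)              # no amount of waiting helps here, only CPU time
```

The overlapped waits finish in roughly the time of one wait, not four. The Fibonacci call gets nothing from that trick, since it never waits on anything.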

Knowing these two concepts, there is a difference between being asynchronous and being multi-threaded. A method marked async actually runs synchronously until it hits an await on something that has not finished yet. At that point the compiler-generated code suspends the method, not the thread: control goes back to the caller, and the thread is free to do other work. A concept that is familiar in node and Erlang. When the operation completes, the rest of the method picks up where it left off. For real IO, no thread sits blocked while the wait is happening. There is a cool article on it on MSDN here. Running blocking calls through the Task Parallel Library, on the other hand, ties up a thread-pool thread for each call that is in flight, and threads are heavy and take time to create, which may cause slow processing.
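The suspend-and-resume behaviour can be sketched with Python's asyncio standing in for .NET. This is an analogy, not the C# mechanism itself: a Python coroutine does not start until it is scheduled, whereas a C# async method begins running as soon as it is called. The `fetch` function and its delays are hypothetical, and `asyncio.sleep` is an assumed placeholder for a network call.

```python
import asyncio
import threading

events = []

async def fetch(name, delay):
    # Runs straight through until the first await.
    events.append(f"{name} started")
    # Suspension point: the coroutine is parked, the thread is not blocked.
    await asyncio.sleep(delay)
    # Resumes here once the simulated IO completes.
    events.append(f"{name} finished")
    return threading.get_ident()

async def main():
    # Both coroutines are in flight at once on the same event loop.
    return await asyncio.gather(fetch("a", 0.2), fetch("b", 0.1))

thread_ids = asyncio.run(main())
```

Both calls start before either finishes, "b" completes first because its wait is shorter, and every step ran on one thread.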

Going back to Oren’s article, async isn’t creating more threads, it is just waiting for responses and handing back answers as results come in. The Task example needed a thread for each of the 500 requests at once, and it took a while. When Oren put async inside Task.Factory.StartNew, it got fast again because it wasn’t tying up 500 threads, but using the “same” thread 500 times. I put quotes there because the ThreadPool, behind the scenes, will reuse threads.
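To make the thread counts concrete, here is a small sketch of 500 simulated requests done both ways, again in Python as a stand-in. It is not Oren's code or his benchmark: the request functions, the sleep durations, and the pool size of 50 are all assumptions picked for illustration.

```python
import asyncio
import threading
import time
from concurrent.futures import ThreadPoolExecutor

REQUESTS = 500  # same count as in the article; everything else is made up

def blocking_request(_):
    # Blocks its thread for the whole simulated wait.
    time.sleep(0.02)
    return threading.get_ident()

async def async_request():
    # Gives the thread back while the simulated wait is in progress.
    await asyncio.sleep(0.02)
    return threading.get_ident()

async def run_async():
    return await asyncio.gather(*(async_request() for _ in range(REQUESTS)))

# Blocking style: the pool has to bring up many threads to keep requests moving.
with ThreadPoolExecutor(max_workers=50) as pool:
    blocking_threads = set(pool.map(blocking_request, range(REQUESTS)))

# Async style: all 500 waits are multiplexed onto one thread.
async_threads = set(asyncio.run(run_async()))
```

Counting the distinct thread ids in each set shows the difference: the blocking version spreads across many threads, while the async version handles all 500 waits on one.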

To me, it sounds like Erlang is great for this kind of IO communication, because it can send and receive massive amounts of messages. F# can do this too, but MailboxProcessor and Async only cover two of the advantages that let Erlang process insane amounts of messages. In the next articles I think I will start looking into creating separate standalone processes and the distribution models from Erlang. Maybe F# has something?