IFIP WG 10.3 Meeting in Yorktown Heights, NY, on December 8, 2017


9.00 Opening and Coffee

Prof. Guang R. Gao

MICRO 50 Years: a Personal Reflection on Dataflow Models of Computation


Dr. Kemal Ebcioglu

On the Future Evolution of Application-specific Supercomputers in the Cloud 

10.45 Break

Prof. Alex Shafarenko

AstraKahn: Refining Applied Kahn's Networks

11.45 Lunch Break

Prof. Sally A. McKee

Some Like it Cold: Initial Testing Results for Cryogenic Computing Components


Prof. Vladimir Lazarov


Prof. Nader Bagherzadeh

An overview of two of our efforts related to machine learning

03.15 Afternoon Coffee

WG 10.3 Business

05.00 Closing

Dinner at Lefteris Gyro restaurant in Mt Kisco, NY

Prof. Guang R. Gao: MICRO 50 Years: a Personal Reflection on Dataflow Models of Computation

This talk starts with a personal reflection on the historic MICRO-50 anniversary by the speaker, the 2017 recipient of the Bob Ramakrishna Rau Award. The speaker reviews the legacy of Bob Rau’s contributions as well as his personal interactions with Bob; both have had a lasting impact on his own career in compiler and microarchitecture technology over the past 30 years. Facing the challenges of the end of Moore’s Law, as well as those posed by applications in advanced data analytics and machine learning, the speaker believes that it may be time to initiate a new forum encouraging broader participation and direct interaction among scientists and engineers working on microarchitectures, compilers, hardware synthesis, runtime systems, and operating systems as a co-design community. He hopes that the IFIP Working Group 10.3 on Concurrent Systems will be an excellent forum in which to start this interaction and conversation.

The speaker stresses that, in his personal view, the 2017 Bob Rau Award goes beyond himself: it is also a recognition of the seminal contribution and impact of the dataflow model of computation pioneered by Jack B. Dennis and Arvind at MIT. Both are his mentors; he was the first Ph.D. recipient in Computer Science at MIT from the People’s Republic of China since 1980.

Dr. Kemal Ebcioglu: On the Future Evolution of Application-specific Supercomputers in the Cloud 

In this talk, we will describe a vision for next-generation, easy-to-create, power-efficient, custom hardware-accelerated cloud computing data centers with exascale computation capabilities in both technical and non-technical applications.

In our approach, high productivity in application-specific cloud hardware design is achieved by our C++ compiler technology, which, unlike today's C-to-gates compilers, creates a customized supercomputer hardware system from general, ordinary single-threaded code (without any specification of hardware structure). The system is partitioned into multiple chips, where custom hardware components are interconnected by a plurality of lightweight scalable networks instead of ordinary wires. Application memory is partitioned using compiler dependence analysis, removing the need to maintain coherence between memory partitions, thus reducing hardware complexity and improving power efficiency. Furthermore, highly parallel pre-emptive scheduling for sharing custom hardware resources in an application-specific cloud data center is done entirely in hardware.

Prof. Alex Shafarenko: AstraKahn: Refining Applied Kahn’s Networks

This is a short talk describing work in progress. We propose a refinement of applied Kahn’s Process Networks, where vertices are structured into network templates according to their concurrency properties. The end result is the derivation of finite state machines that synchronise streams, and which can be analyzed using the probabilistic state automaton techniques known in Machine Learning. It is our position that this form of structuring has the potential to facilitate automatic coordination of streaming networks both in the large (Web service networks) and in the small (NoC).

Prof. Sally A. McKee: Do Superconducting Processors Really Need Cryogenic Memories? The Case for Cold DRAM

Cryogenic, superconducting digital processors offer the promise of greatly reduced operating power for server-class computing systems. This is due to the exceptionally low energy per operation of Single Flux Quantum (SFQ) circuits built from Josephson junction devices operating at a temperature of 4 K. Unfortunately, no suitable same-temperature memory technology yet exists to complement these SFQ logic technologies. Possible memory technologies are in the early stages of development but will take years to reach the cost per bit and capacity of current semiconductor memory. We discuss the pros and cons of four alternative memory architectures that could be coupled to SFQ-based processors. Our feasibility studies indicate that cold memories built from CMOS DRAM and operating at 77 K can support superconducting processors at low cost per bit, and that they can do so today.

Prof. Nader Bagherzadeh: An overview of two of our efforts related to machine learning

We are working on heterogeneous device mapping of neural networks
using TensorFlow, with genetic algorithms to explore the search space
and a machine learning algorithm to predict the makespan of the neural
network.  In other, lower-level hardware optimization work, we are
using an approximate log multiplier to make CNN computations more energy