Research and development of systems that involve natural language understanding and knowledge.
Previously, worked on systems software, including research on operating systems, computer architecture,
distributed systems and cloud computing.
Currently, I am a Software Engineer at Google Research,
where I work with the Google Research Language team.
Previously, I was a Research Staff Member at IBM Watson Group as well as at IBM Research. I worked on the
Watson Concept Insights cloud service, having co-founded
the research project and later serving as lead software engineer and system architect.
I received a Ph.D. in Computer Engineering at the University of Toronto in 2012, advised by
Prof. Michael Stumm. For my Ph.D., I invented and developed mechanisms
to improve the performance of network and I/O applications, operating systems and run-time systems.
Projects
Natural Language Understanding at Google Research (2016-present)
Working on a variety of projects related to Natural Language Understanding at Google Research (NYC). In conjunction with Google Cloud teams, extended language support for syntactic and entity analysis from three initial languages to eleven. Also worked internally to improve the quality of entity recognition, entity typing and entity linking.
More recently, I have been working on information extraction and knowledge graphs, specifically learning how to generate neural representations of information, directly from text, with minimal or no direct supervision from existing knowledge graphs.
Relevant publications: [17], [18], [19], [20], [21] and [22].
Watson Concept Insights (2013-2016)
System architect, lead software engineer and co-founder of “Watson Concept Insights” cloud service. Took the project from a research idea all the way to production. Worked on all aspects of research and development including algorithm and data research, system architecture and development, API design, pricing, dashboard UI, testing, and devops.
- System architect, lead engineer and co-founder of "Watson Concept Insights" cloud service.
- Took the project from a research idea all the way to production. I was involved in all aspects of the project, including algorithm and data research, system architecture and development, API design, pricing, dashboard UI, testing, and devops.
- Co-authored over 15 patents in areas related to cognitive computing, information retrieval and graph analysis.
- Designed and developed REST API sustaining over 500,000 API calls per day in production.
- Developed micro-service based distributed system with over 10 independent services, spanning hundreds of machines.
- Deployed and managed large installations of storage clusters, including Cassandra, MongoDB, Ceph/Rados Object Store and Redis.
- Developed novel storage library allowing transparent stacking ("union") of distributed object stores, as well as on demand object retrieval for distributed computational kernels.
- Significant portion of system code written in Go; some portions of algorithmic kernels in C/C++.
- Optimized C kernel from over 2 minutes of execution down to under 10 seconds. Optimized conceptual query latencies from over 30 seconds down to under 1 second.
- Developed monitoring solution of services and machines encompassing metrics, logs and health checks.
- Service used for the ACM Digital Library's author recommendations, shown on papers hosted at dl.acm.org.
Service oriented peer-to-peer middleware (2012-2013)
Researched and prototyped novel peer-to-peer distributed system with the goal of offering core services
for the construction of service-oriented applications. The main goal of the project was to produce a suite of tools
and micro-services to make it easier for developers to write robust, large distributed applications or cloud services.
The project brought together a distributed key-value store with dynamic consistency and availability guarantees,
a membership/directory service, a topology aware messaging bus, a deployment system based on Linux containers (LXC),
and integrated with monitoring.
- Designed and developed 6 different scale-out peer-to-peer microservices in Go.
- Designed and developed fully distributed in-memory key-value store with variable consistency and availability guarantees (per-collection). Collections could be configured to be fully consistent (Raft-based consensus) or fully available (peer-to-peer protocol).
- Designed and developed highly-available distributed DHCP service for dynamically assigning IPs to VMs and containers within a data-center.
- Designed and developed fully distributed (peer-to-peer) membership and directory service for storing cluster and service-level membership information.
- Designed and developed distributed network monitoring micro-service, allowing clients to query network failures within different regions (machine, rack, zone or data-center).
- Designed and developed distributed messaging service.
- Designed and developed distributed deployment system for micro-services using distributed fleet of agents, responsible for launching container or virtual-machine based run-times (using libvirt).
Cloud management resiliency (2012-2013)
Explored resiliency of cloud management systems, focusing on OpenStack.
Developed mechanism for monitoring and tracking distributed requests within OpenStack services. This mechanism was
used to log specific distributed flows in the presence of faults or crashes. Extended this basic mechanism to
introduce artificial faults at specific events, along with automatically validating expected outputs from specific
series of requests.
Relevant publications: [14] and [15].
FlexSC (exception-less system calls) (2010-2011)
Research in the area of operating systems and computer architecture, focusing on run-time performance.
Created a novel operating system interface for traditional monolithic kernels (e.g., Linux), called
exception-less system calls, that enables applications to communicate with the operating system via
asynchronous messages. Created a new POSIX-compatible threading library that lets multi-threaded server applications
(e.g., Apache, MySQL and BIND) use exception-less system calls efficiently.
Created a new event-driven library to support explicitly asynchronous application/OS execution, and
ported memcached and nginx to this new library. Demonstrated that exception-less system calls lead to efficient
execution on multi-core processors.
Relevant publications: [10] and [11].
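To make the idea of communicating with the operating system via asynchronous messages more concrete, here is a minimal user-space simulation in Go. It is an illustration only, not the FlexSC kernel interface: the `entry` layout, the slot states, and the `submit`, `wait` and `syscallThread` helpers are all hypothetical, a goroutine stands in for the kernel-side worker, and plain Go functions stand in for system calls. The sketch also assumes a single submitting goroutine.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// Slot states, stored in entry.status and always accessed atomically.
const (
	slotFree int32 = iota
	slotSubmitted
	slotDone
)

// entry is one slot of a hypothetical shared "syscall page".
type entry struct {
	status int32
	call   func(a, b int64) int64 // stand-in for a system call number
	a, b   int64                  // arguments
	ret    int64                  // return value, valid once status is slotDone
}

// syscallThread plays the kernel-side role: it scans the page and executes
// submitted entries, so the application never traps to request work.
func syscallThread(page []entry, stop <-chan struct{}) {
	for {
		select {
		case <-stop:
			return
		default:
		}
		for i := range page {
			e := &page[i]
			if atomic.LoadInt32(&e.status) == slotSubmitted {
				e.ret = e.call(e.a, e.b)
				atomic.StoreInt32(&e.status, slotDone)
			}
		}
		runtime.Gosched()
	}
}

// submit posts a request into a free slot and returns immediately with the
// slot index, or -1 if the page is full.
func submit(page []entry, call func(a, b int64) int64, a, b int64) int {
	for i := range page {
		e := &page[i]
		if atomic.LoadInt32(&e.status) == slotFree {
			e.call, e.a, e.b = call, a, b
			atomic.StoreInt32(&e.status, slotSubmitted) // publish the request
			return i
		}
	}
	return -1
}

// wait polls the slot until the result is ready, then frees the slot.
func wait(page []entry, i int) int64 {
	e := &page[i]
	for atomic.LoadInt32(&e.status) != slotDone {
		runtime.Gosched()
	}
	ret := e.ret
	atomic.StoreInt32(&e.status, slotFree)
	return ret
}

func main() {
	page := make([]entry, 4)
	stop := make(chan struct{})
	go syscallThread(page, stop)

	add := func(a, b int64) int64 { return a + b }
	mul := func(a, b int64) int64 { return a * b }
	i := submit(page, add, 2, 3)
	j := submit(page, mul, 4, 5)
	// The application is free to do other work here before collecting results.
	fmt.Println(wait(page, i), wait(page, j)) // prints "5 20"
	close(stop)
}
```

The point of the sketch is the decoupling: posting a request and executing it happen at different times and potentially on different cores, which is what allows several requests to be batched before any result is collected.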
Runtime systems using hardware performance counters (2007-2010)
Created tools for exploring the performance of applications, run-time systems and operating systems using hardware performance counters.
The tooling and analysis I worked on allowed our research group to design novel techniques for improving utilization of processor cache.
The technique I pioneered was an operating-system-level mechanism for improving processor cache performance, called the
"software pollute buffer". It profiles application cache performance at run-time, through hardware
performance counters, and dynamically remaps application pages with poor cache utilization, using standard page-coloring.
I implemented this prototype within the Linux kernel, using the hardware performance counters from a PowerPC processor.
Relevant publications: [7], [8], [5], [6], [9] and [3].
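The page-coloring arithmetic behind such remapping decisions can be sketched as follows. This is a simplified illustration in Go, not the Linux kernel prototype: the `pageStats` type, the `polluters` helper, the miss-ratio threshold and the cache parameters are assumptions made for the example. It uses the standard relation that a physically indexed cache has cache size / (associativity × page size) page colors, and that a page's color is its physical frame number modulo that count.

```go
package main

import "fmt"

// numColors returns the number of page colors of a physically indexed cache.
func numColors(cacheBytes, ways, pageBytes uint64) uint64 {
	return cacheBytes / (ways * pageBytes)
}

// pageColor returns the color of a physical page frame number.
func pageColor(pfn, colors uint64) uint64 {
	return pfn % colors
}

// pageStats is a hypothetical per-page profile, as might be derived from
// hardware performance counter samples.
type pageStats struct {
	pfn      uint64
	accesses uint64
	misses   uint64
}

// polluters returns the frames whose miss ratio is at or above threshold.
// These are candidates for remapping to a small reserved set of colors.
func polluters(stats []pageStats, threshold float64) []uint64 {
	var out []uint64
	for _, s := range stats {
		if s.accesses == 0 {
			continue // no samples for this page
		}
		if float64(s.misses)/float64(s.accesses) >= threshold {
			out = append(out, s.pfn)
		}
	}
	return out
}

func main() {
	// Assumed cache: 2 MiB, 8-way set associative, 4 KiB pages.
	colors := numColors(2<<20, 8, 4096)
	fmt.Println(colors, pageColor(70, colors)) // prints "64 6"

	stats := []pageStats{
		{pfn: 70, accesses: 1000, misses: 950}, // streams through the cache
		{pfn: 71, accesses: 1000, misses: 20},  // reuses its cache lines
	}
	fmt.Println(polluters(stats, 0.9)) // prints "[70]"
}
```

Confining the selected pages to a few colors limits how much of the cache they can evict, leaving the remaining colors to pages that actually reuse their data.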