Projects

  • Behind Density Lines: Machine Learning and Citizen Scientists in Quantifying Scanning Electron Microscopy Images. (2023-)

  • Leveraging Large Language Models to Understand the Financial Behavior of Women and Minority Groups. (2023-)

  • Data-centric MLOps for Image Segmentation in Cell Organelles. (2022-2023)

  • Data-Centric Approach to Digital Twins of the Built Environment. (2022-2023)

  • Council on Library & Information Resources: Entomo-3D: Digitizing the Virginia Tech Insect Collection. (2020-Present)

  • Virginia Tech Digital Libraries Platform (2018-Present)
    A multi-tenant, cloud-native digital library platform for managing digital assets at the Virginia Tech Libraries. The platform is built on a serverless architecture in which microservices interact with AWS managed services.
    Techniques: AWS, Amplify, API Gateway, Lambda, CloudFormation, DynamoDB, Elasticsearch, GitHub Actions, Microservices, Serverless, Cloud-native


    Links: IAWA, SWVA, DLP-access, AWS Amplify, AWS Batch IIIF generator, ID Minting Service, Resolution Service
  • Fedora open source repository project (2014-Present)
    Fedora is a robust, modular, open source repository system for the management and dissemination of digital content. It is especially suited for digital libraries and archives, both for access and preservation.
    Techniques: Java, Maven, Ansible, Docker, AWS, Kubernetes


    Links: Fedora 4, Fedora 5, Fedora Kubernetes, Fedora 4 Ansible, Fedora Docker
  • Virginia Tech Libraries - CollabVT and VIVOHarvester (2018-2019)
    CollabVT is a project to collect and expose faculty scholarship, grants, and other information in a public-facing profile system. The system is built on top of VIVO. The VIVOHarvester tool was developed to harvest faculty data from Symplectic Elements.
    Techniques: Python, Java, SPARQL
  • Cloud-native Data Analytics application for ETDs (2018), Research project
    A cloud-native data analysis application that lets librarians explore the ETDs preserved at Virginia Tech. The application is built on a serverless architecture, with microservices and managed services as the backend, and is deployed on AWS.
    Techniques: Microservices, Elasticsearch, Kibana, AWS, Lambda, DynamoDB, S3
  • Virginia Tech Libraries - GeoData (2016-2018)
    GeoData provides information about geospatial data that Virginia Tech Libraries has pledged to serve on behalf of local government organizations and data that has been acquired through research.
    Techniques: GeoBlacklight, Ruby, Solr, Ansible, Vagrant


  • Virginia Tech Libraries - VTechData (2015-2018)
    Virginia Tech’s Data Repository is a platform for highlighting, preserving, and providing access to the work generated by the Virginia Tech Community.
    Techniques: Ansible, Ruby, Fedora 4, Solr, Sufia, Vagrant


  • ETDplus - Supporting the evolution of ETD research products (2014-2017)
    A web-based tool designed to assist students in preparing and packaging ETD supplementary materials for long-term preservation and access.
    Techniques: AWS, EC2, Ansible, Ruby, Fedora 4, Solr, Sufia, Vagrant


  • FishTraits Database (2014)
    This project was developed based on .
    Techniques: Django, Python, AWS, EC2


  • Ensemble: Enriching Communities and Collections to Support Education in Computing (2008-2014), Research project
    Ensemble is an NSF-funded project and a distributed portal for computing education. The portal provides access to a broad range of existing educational resources for computing while preserving the collections and their associated curation processes.
    Techniques: Drupal, Solr, Machine learning, AWS, EC2, Lambda, RDS, CloudFront, S3


  • A Digital Library for Recovery, Research, and Learning from April 16, 2007 (2008-2009), Research project
    • Developed a digital library to archive news, photos, and videos related to the VT 4/16 event. Implemented text and image search features using Lucene, and developed crawlers to collect news, photos, and videos from the Web.
  • Data Communication Branch, Chunghwa Telecom Co., Ltd., Projects (2002-2007)
    • Managed and developed three enterprise software projects: an email advertisement platform supporting over 2 million Telecom members, an online payment service serving over 3 million active users, and an eGovernment document system, a digital repository for government documents that integrates LDAP web applications.

Awards

  • Amazon AWS Cloud Credits for Research Grants, June 2017 – June 2018
  • Amazon AWS in Education Research Grants, Aug 2010 – July 2011
  • AT&T Big Mobile on Campus Challenge, 5th place, 2009