Big Data Analytics

Research Group, Computer Science, University of Hull


We are the Big Data Analytics research group (BDA) in the Department of Computer Science and Technology (CST) at the University of Hull. We collaborate with the Dependable Intelligent Systems (DEIS) research group in CST, and also work closely with other parts of the university, including the Viper HPC team, the Energy and Environment Institute, and the Aura Centre for Doctoral Training in Offshore Wind Energy and the Environment.

We first came together as a group in 2017 and have since been growing an interdisciplinary research portfolio in artificial intelligence and data analytics, particularly focusing on research in (deep) machine learning, interactive systems and natural language processing, and increasingly AI for sustainability and environmental modelling. We have an increasing number of collaborations with partners in academia, industry and the public sector, generating novel fundamental research, knowledge transfer and impact acceleration projects.

Our Research

Deep Learning and Data Analytics

We apply machine learning, mostly deep learning, in all our research projects, due to its ability to learn features from large datasets and identify patterns in disparate, large and noisy datasets. Most of our work on natural language processing and other time-series applications uses LSTM/GRU or transformer models (e.g. for sentiment analysis, natural language generation) while other projects in image analysis and sensory output analysis have used CNNs or autoencoders.

Our research in this area focuses on deep learning models that are transparent and interpretable and can explain the rationales behind their actions. This is important in many commercial and real-life applications because it increases trust, helps to avoid bias and discrimination in AI models, and makes them generally more accountable.

We are also interested in fast learning systems and investigate topics in transfer learning, divide-and-conquer approaches to learning and efficiency. See our deep learning publications for details.

Natural Language and Social Media

Natural language processing (NLP) refers to approaches that automatically analyse, generate and process natural language using computational techniques and algorithms, mostly based on machine learning. Our strengths lie in natural language generation, i.e. the automatic generation of language from non-linguistic data sources, such as data-to-text generation; (spoken) dialogue systems, i.e. systems that engage in conversations with users to solve problems or provide information; sentiment analysis in text and social media; and text classification and analysis.

We are also interested in social media analysis, including projects on sentiment analysis and mental health based on linguistic profiling, flood forecasting and emergency situation assessment from social media and topics in digital conservation, such as public opinion mining and the monitoring of species. See our natural language processing publications for details.

Interactive Systems and Decision Support

Interactive systems can interact with their users in different ways, including text, speech, haptic or other multimodal input. A common distinction is between task-oriented interactive systems that target a particular domain, e.g. answering user queries on restaurants or tourist attractions in the local area, and open-domain systems that attempt to answer any question, e.g. using information retrieval methods from the web. Most personal assistants fall into the latter category. Finally, chatbots do not aim to inform but rather to entertain users by engaging in casual chat.

Our group is interested in human-like behaviour of interactive systems, e.g. the incremental planning of utterances and the analysis of linguistic features in different conversational contexts. Most recently we have worked on interactive decision support systems for various real-life applications, most notably in the wind domain, where we use interaction as a way to provide easy and intuitive access to the inner workings of wind turbines to support operations and maintenance. See our interactive systems publications for details.

Sustainability and the Environment

We are applying AI and machine learning techniques to help solve problems in sustainability and environmental research. We are part of the EPSRC Aura CDT on Offshore Wind and the Environment, and we have a strong interest in the digitalisation of offshore wind. As part of this, we use AI to improve the operations and maintenance of wind turbines, increasing their reliability and usability as an alternative to fossil fuel-based energy sources. We achieve this through detailed and explainable fault forecasting and the automatic generation of maintenance strategies from expert knowledge sources.

Other research includes the forecasting of floods from high-frequency and social media data sources and the monitoring of water quality in the North Sea from sensory data and satellite images to achieve longer-term forecasting of the effects of human activity or environmental factors on water quality. We are currently growing our research on using AI towards the conservation of species. See our sustainability publications for details.


Rapid prototyping of text analysis via domain transfer

This is an impact acceleration project that enables rapid analysis of text documents from interactive feedback and sparse human labels, by leveraging our previous research in NLP and language models.

HEIF, 2022-23 (PI)

Physics-informed machine learning for rapid fatigue assessments in offshore wind farms

This project proposes an industry-compatible step-change advance in accumulated fatigue assessment via novel integration of physical modelling and machine learning.

With the EEI and University of Sheffield.

EPSRC Supergen ORE, 2021-22 (PI)


KTP with Reckitt

We develop novel algorithms to improve stability testing of new product formulations and accelerate their launch from initial conception and design through to testing and manufacturing. This will accelerate product improvement and enhance Reckitt's capacity to react to customer feedback and product reviews.

Innovate UK, 2021-23 (PI)

Aura CDT

We are part of the EPSRC Aura CDT on Offshore Wind and the Environment and research theme lead for "Big data, sensors and digitalisation for the offshore environment".

With partner institutions Sheffield, Durham and Newcastle.

EPSRC, NERC 2019-27 (CI)


KTP with Spencer Ltd

We apply research in data analytics, NLP and AI to analyse company documents, identify patterns, extract information and generate insights towards more effective workflows and informed decision support.

Innovate UK, 2020-22 (PI)

SESAME - Secure and Safe Multi-Robot Systems

Research to address the openness, uncertainty, variability, and interplay of safety and security in multi-robot systems.


EU H2020, 2021-24 (CI)

Deep Text Generation from Knowledge Graphs

We research hierarchical decomposition approaches for fast learning in natural language generation systems that can produce high-quality outputs from sparse and unseen knowledge graphs.

Diffbot (industry grant), 2017-19 (PI)

Text mining of legal documents using an interactive learning approach

The key aim of this project is to semi-automate the analysis of legal documents with intelligent software that can extract prominent themes from a document, identify likely areas for legal attention and generate a list of suggested amendments to the document.

With legal firm rradar.

HEIF, 2019 (PI)


Dr Nina Dethlefs, Senior Lecturer
Natural language processing, interactive systems, artificial intelligence

    I am a Senior Lecturer in Computer Science at the University of Hull, Yorkshire, UK, where I lead the Big Data Analytics Research group. I am also currently Director of Research for Computer Science and Technology and Aura CDT Theme lead for "Big data, sensors and digitalisation for the offshore environment". Before coming to Hull, I was a Research Fellow in the Interaction Lab at Heriot-Watt University, Edinburgh. I have a PhD in Computational Linguistics from the University of Bremen, Germany.

    My research interests lie at the intersection of machine learning and natural language processing (NLP), particularly in the areas of data-to-text and natural language generation (NLG), interactive systems, assistive technologies, domain transfer and adaptability for data analytics in a wider AI context. I have spent the last few years working with neural networks as a primary algorithm family but have previously worked with graphical models, clustering and reinforcement learning. Most recently I have become interested in applying AI and NLP in "useful" contexts such as mental health and the environment, particularly towards sustainability. I am interested in the digitalisation of the offshore wind industry to make wind turbines more reliable. I am also interested in the effects of human activity on water quality and the forecasting of natural events such as floods. When I have time I do some research in digital conservation using AI and text classification.

Dr Joyjit Chatterjee, KTP Associate
Artificial intelligence, wind turbines, green energy, natural language processing

    Joyjit is currently a postdoctoral researcher and KTP Associate with the University of Hull and Reckitt. He is developing novel AI-based techniques to automate certain stages in consumer care product testing. Joyjit holds a PhD from the University of Hull (2021).
    His PhD focused on tackling climate change with AI by making wind energy sources more reliable through explainable and intelligent decision support in their operations & maintenance. During his PhD, he collaborated closely with leading wind farm operators and research organisations as part of the PhD cluster of the Aura Innovation Centre, UK. He is a Chartered Engineer, MIET, MIEEE, AMIE, Jr. Member of the Isaac Newton Institute for Mathematical Sciences, Cambridge, and Life Member of the Indian Science Congress Association. He is a certified Professional Scrum Master (PSM-1) with a sound understanding of the industry. Joyjit also holds a Postgraduate Diploma in Research Training (Hull, UK), a Professional Certificate in AI (Oxford, UK), and a Bachelor's degree in Electronics & Communication Engineering (Gold Medalist, Amity University, India). He has published papers in Computer Science & Engineering, including at leading A*/A-ranked AI conferences and workshops such as NeurIPS, KDD, IJCNN and ECAI, and in journals published by Wiley, IOP and others. He has co-authored papers and filed patents with academics from across Europe, North America and Asia in AI, NLG and Signal Processing, and has delivered talks at leading research institutions such as Carnegie Mellon University and ETH Zurich. Joyjit was also a Visiting Researcher at Tamkang University, Taiwan, on a government-funded research project. He regularly serves on programme committees and as a reviewer for leading international conferences and workshops (such as ICLR and NeurIPS) and for high-impact journals. Joyjit is the recipient of several honours and awards, including a PhD Scholarship (Hull, UK), the Employability Award for Postgraduate Researchers (Hull, UK), funding towards conference presentations from international organisations such as IEEE CIS and Microsoft Research, a Scholarship for the Global Impact Challenge 2020 (Nudge, Netherlands), the Young Researcher Award (IEEE UK & Ireland Section), a Gold Medal and Scholarship (Amity University, India), the Alumni Achiever Award for Outstanding Contribution in Education (Amity University, India) and several other awards, certifications and honours from governments, industry and academia. His research interests span Deep Learning, Natural Language Generation, Data Analytics, Knowledge Graphs, Causal Inference and Time-Series Analysis.

    See Joyjit's blog articles on The Good AI for further background on his thesis and the use of AI in renewable energy.

Dr Bikash Gyawali, KTP Associate
Artificial intelligence, data analysis, natural language processing

    I am a researcher working in Natural Language Processing and Big Data Systems. Over the last five years, I have worked as a full time researcher on several topics related to Natural Language Understanding and Generation, Deep Learning, Big Data Analysis and Text Mining of scientific documents.

    Bikash is currently a KTP Associate with Spencer Ltd and is applying modern deep learning techniques to text analysis and document classification to optimise Spencer's workflows and productivity. Bikash holds a PhD from the Université de Lorraine in France (2015).

Dr Robert Houseago, Research Associate
Environmental modelling, coastal geomorphology, data analytics

    Bobby is a coastal geomorphologist interested in the physical processes influencing coastal dynamics and delta morphodynamics. His research projects incorporate coastal vegetation, sediment fluxes, and hydrodynamics. His current primary research project investigates the role of seagrass in hydrodynamics and wave attenuation via experimental flume research, validated by field measurements as part of an international collaboration within the Hydralab+ project. A range of interdisciplinary data collection techniques are applied in this research to quantify complex interactions and non-linear processes associated with vegetation, wave dynamics and sedimentation.

    Bobby is a member of the Energy and Environment Institute and a postdoctoral researcher on our EPSRC Supergen ORE grant, working on hybrid methods for fatigue forecasting in wind turbine monopiles, bridging physical and machine learning methods.

Dr Yifei Wang, Research Associate
Natural language processing, AI, deep learning

    Yifei is an expert in text mining and is currently a Research Associate in Natural Language Processing, focusing on text analysis. He is working on a range of text analysis tasks, including named entity recognition and information extraction, and is investigating novel techniques to learn robust deep learning models from small datasets, where data is scarce due to limited labels or general data sparsity, e.g. the lack of multilingual support in existing models.
    Yifei holds a PhD from the University of Manchester (2021), where he researched the use of radical features in Chinese medical text mining.

Lydia Bryan-Smith, PhD student
Artificial intelligence, machine learning, environmental modelling, social media analysis

    Lydia is a PhD researcher in Computer Science at the University of Hull, having previously graduated from Hull with an MSc. Her PhD project investigates deep learning algorithms for real-time flood prediction from multiple disparate data sources. An underlying idea of Lydia's research is to combine traditional hydrodynamic flood models with modern AI techniques to achieve a balance between high accuracy of forecasts and efficient predictions. The learning models are trained with information from different sources, including height maps of the geographical area, dynamic rainfall radar data, and social media posts.

Onatkut Dagtekin, PhD student
Artificial intelligence, machine learning, environmental modelling, natural language processing

    Onatkut is currently a PhD student in Computer Science at the University of Hull. He holds an MSc degree from the University of Manchester. For his PhD project, he investigates deep learning techniques to discover key predictors of water quality from high-frequency sensor outputs. These techniques aim to be as independent as possible of particular geographical locations and instead rooted in the environmental context in which they occur. This can make them potentially transferable across different water bodies and regions of the world, covering e.g. the proximity of water to farmland, caves or industrial activity, and the impact of natural phenomena such as landslides, floods and other weather events. The overarching goal of the project is to be able to understand and predict the impact of human activity on aquatic ecosystems and wildlife.

Luana Mincarelli, PhD student
Marine biology

    The aim of my PhD project is to investigate how different combined stressors act in the environment, with particular attention to biological responses in mussels.

    I obtained an MSc in Marine Biology at Marche Polytechnic University (Ancona, Italy), working on a project concerning the modulatory effect of climate change (global warming and ocean acidification) and heavy metals exposure on biological responses of the Mediterranean mussel Mytilus galloprovincialis.

    The first part of my PhD involves the exposure of the blue mussel Mytilus edulis to several stress compounds. At the moment, I’m studying how different temperatures and phthalate exposure can affect biological responses of blue mussels. The second part of my project will involve data analysis: by the end of my PhD, I would like to have collected enough data to use artificial intelligence and machine learning for simulating future climate conditions and their consequences for mussel populations.

Ervands Mumdzjans, PhD student
AI, deep learning, time-series analysis

    Erv is currently researching efficient ways to model sequences and time series.

Samuel Rose, PhD student
Deep learning, natural language processing

    Sam works on the automatic processing and analysis of text to identify indicators of dyslexia.

Callum Rothon, PhD student
(Deep) Computer Vision, Structural health monitoring, offshore wind

    Callum is part of the Aura CDT in Offshore Wind Energy and the Environment and is a mechanical engineer by degree. His PhD focuses on structural health monitoring of wind turbines, specifically damage detection from images.

Victoria Sherratt, PhD student
Natural language processing, AI, deep learning

    Victoria is researching a computational semiotics framework for the analysis and generation of memes.

Eva Sousa, PhD student
Artificial intelligence, medical imaging

    I obtained my Licentiate degree in Nuclear Medicine in 2006 from Escola Superior de Tecnologia da Saúde de Lisboa – Instituto Politécnico de Lisboa (ESTeSL-IPL). I then completed my MSc in Biophysics and Medical Physics in 2009 at Instituto de Biofísica e Engenharia Biomédica, Faculdade de Ciências da Universidade de Lisboa (IBEB-FCUL), and a Biomedical Engineering Specialization in 2014 at Faculdade de Ciências e Tecnologias da Universidade de Coimbra (FCTUC). I am presently a PhD student at the University of Hull.

    From September 2008 to September 2018, I was a full-time Assistant Lecturer at ESTeSL-IPL, lecturing in the degrees of Nuclear Medicine and Medical Imaging and Radiotherapy. During this time, I gained experience of lecturing at institutions abroad under the Erasmus Programme: in 2010 (Karolinska Institutet – Sweden), in 2012 and 2013 in NuPHiCos (Medical University of Plovdiv – Bulgaria), and in 2015 (Institute Paul Lambin – Brussels). From 2012 to 2018, I coordinated the entrepreneurship programme of ESTeSL-IPL.

    I have collaborated with several research teams and projects and have been a member of the research groups GI-MOSM and GIRES. I have also collaborated on research projects funded by Instituto Politécnico de Lisboa through the IDI&CA fellowship (IPL, 2016).

    In my spare time, I am also a writer, blogger and volunteer.


Dr Alex Turner, Affiliate, now Assistant Professor at Nottingham University
Artificial intelligence, healthcare, complex dynamical systems

Dr Annika Schoene, PhD student, now Research Fellow at NaCTeM, University of Manchester
Deep learning, machine learning, sentiment analysis, natural language processing

Dr Darryl Davis, Senior Lecturer, now freelancing / enjoying retirement
Artificial intelligence, machine learning, data analytics, robotics

Albert Dulian, PhD student, now Deep Learning Scientist at Vicon
Deep learning, computer vision, reinforcement learning, self-driving vehicles

Dr George Lacey, PhD student
Deep learning, genetic algorithms, complex dynamical systems

Dr Ric Colasanti, Postdoctoral Researcher, now at Bournemouth University
Artificial intelligence, machine learning

Joshua Bee, MSc (Research) student, now at META
Deep learning, computer vision

News & Activities

Forbes 30 Under 30

We are proud that our very own Joyjit has been named in Forbes 30 Under 30 Europe this year in the Manufacturing and Industry category, which recognises people who create the products, methods and materials of tomorrow! Congrats Joyjit, well deserved!

Centre of Excellence in Data Science, AI and Modelling

We're part of the Faculty's new Centre of Excellence for Data Science, Artificial Intelligence and Modelling (DAIM). You can join us as a taught MSc student, PhD student, or as a new member of academic staff.

Inspired in Hull Awards 2022

Nina is extremely excited to have received the Inspired in Hull Award for "Excellence in Supervision". What an honour, given the amazing colleagues who were nominated. Having a brilliant cohort of PhD students like our BDA bunch is a real advantage!

ICLR Social on Facilitating the Renewables Transition with AI

We're thrilled to be hosting a Social session at ICLR 2022, featuring a panel discussion with a set of experts from the wind and renewables sector, spanning academia and industry. We hope to explore the opportunities and risks of AI in the transition towards Net Zero. See our dedicated page here.

Conversation article

Read our latest Conversation article! We discuss the carbon footprint of modern AI and how it contributes to furthering the divide between developed and developing countries.

  • BDA, EEI
  • 28 March - 1 April 2022

Hackathon in AI for Sustainability

Our (first) NERC discipline hopping Hackathon in AI for Sustainability took place w/c 28 March, jointly organised by Computer Science and the Energy and Environment Institute. We spotted hedges from the air, predicted the structural health of wind turbine monopiles, and predicted floods from social media. See our hackathon page here.

Supergen best poster award!

Congratulations to Bobby Houseago for winning the Early Career Researcher Poster Prize at the Supergen Annual Assembly this year for his poster on rapid fatigue assessment in offshore wind farms. Yay!

BDA Christmas 2021

Thanks all who made it to our annual sparkly Christmas research workshop to present your work, discuss ideas, and eat some cake... Looking forward to next year!

Green Talents Award 2021

We're massively proud of Joyjit who got selected for a Green Talents Award from amongst 25 researchers from around the globe. The German Ministry of Education and Research awards these prestigious titles + a fully funded research stay in Germany to young people that have made an outstanding contribution to research in sustainability.

Hedgehog Rescue

The University are part of an amazing initiative to create a hedgehog-friendly campus, see here. This involves raising awareness of hedgehogs and their habitats, creating safe spaces, including bug hotels and hedgehog shelters, and keeping our campus clean and tidy. We're very excited about this!

Congrats Joyjit!

Congratulations to Dr Joyjit Chatterjee, who has successfully passed his viva, completing his PhD on Explainable AI for O&M of Wind Turbines, see here. Well done Joyjit - and we're excited that you're staying with us in a postdoctoral role.

EPSRC Supergen ORE grant

We won an EPSRC Supergen ORE grant with Agota Mockute at the EEI and Lizzy Cross from Sheffield and a group of strong industry partners. We'll be predicting fatigue levels of wind turbines using deep learning and physical modelling. See details here.

Nudge Sustainable Impact Challenge

Congratulations to Joyjit for completing the Nudge Global Impact Challenge that selects young research leaders to develop their own sustainability research plans. See Joyjit's promotional video pitching his PhD project here.

Congrats Annika!

Congratulations to Dr Annika Schoene, one of our founding members, who has successfully passed her viva and is leaving us to take up a postdoctoral position at the University of Manchester. Keep in touch, Annika!

KTP with Reckitt

We are really excited to be working with Reckitt on a 2-year KTP project to develop novel AI techniques to improve stability testing and new product development, and accelerate product development from design through testing to manufacturing. What an amazing opportunity!

EMNLP-2019 in Hong Kong

Annika went to Hong Kong to present her paper Dilated LSTM with attention for Classification of suicide notes at the LOUHI workshop at EMNLP-2019. See here.

Offshore Wind Hackathon, Glasgow

Joyjit was part of a Uni Hull team to enter the Offshore Wind Hackathon in Glasgow, applying deep learning to predict the power availability in wind farms.

Knowledge Transfer Partnership on AI / NLP

We are looking forward to working with the Spencer Group on a new Innovate UK-funded Knowledge Transfer Partnership (KTP) to integrate artificial intelligence and natural language processing into corporate workflows. Job advert to follow shortly.

MLSS-2019, London

We are proud of Annika who was selected to attend MLSS this year - THE Machine Learning Summer School, which took place in London from 15 to 26 July 2019. Click here to read more on her blog.

  • Nina
  • 22-25 Jan 2019

Deep Learning Week 2019

We had over 100 participants at our annual Deep Learning Week, including 4 brilliant external speakers, see here and here for our programmes. Thanks to everyone who attended, participated and contributed.

Aura Centre for Doctoral Training in Offshore Wind and the Environment

We are part of Hull's new EPSRC and NERC-funded CDT in Offshore Wind and the Environment together with partners Durham, Newcastle and Sheffield. See here for website and PhD student recruitment.

Northern Lights in Tromsø

Joyjit presented some early PhD work at the Northern Lights Deep Learning Workshop in Tromsø - doing some networking and seeing the Northern Lights. See his presentation and talk here.

GPU programming for Deep Learning -- Jan 24/25 2019

We are hosting a 2-day GPU and CUDA programming course right after the Deep Learning Winter School 2019. The aim is to gain an understanding of GPU hardware and a competence in CUDA GPU programming for scientific applications, with a particular focus on deep learning. See here for details and registration.

Christmas Poster Workshop

On 7th December the Big Data Analytics Research group gathered for its annual Poster Workshop and Christmas Lunch.

ORE Catapult Visit in Glasgow

Joyjit spent some time with the ORE Catapult in Glasgow last month. Read about this visit.

  • BDA
  • 26 Oct 2018

Christmas 2018 -- Posters Workshop and Lunch

We're excited about our upcoming Christmas Workshop and Lunch on 7 Dec. See our flyer.

2nd Deep Learning Winter School -- Jan 22/23 2019

We are organising the 2nd Deep Learning Winter School on 22-23 January 2019 as a hands-on introduction to the theory and practice of deep learning, including talks from leading researchers in the field and a tour of Viper. See here for details and registration.

Aura PhD Cluster - Class of 2018

The new Aura wind energy PhD cluster met for a kick-off meeting with the 4 new students, including Joyjit from the Big Data Analytics group. Read Joyjit's report.

Post-doctoral research assistant - 15 Nov

We're recruiting for a post-doctoral research assistant to work on an EPSRC-funded project on Transparent AI. See here. Deadline 15 Nov 2018.

PhD scholarship - Deep learning for water sensing

We're recruiting a PhD student on a 3-year fully-funded scholarship on data analytics (particularly deep learning) for water sensing. See here. Deadline 23 Jan 2019.

PhD scholarship - Deep learning for flood risk mapping

We're recruiting a PhD student on a 3-year fully-funded scholarship on data analytics from IoT devices for modelling flood risks. See here. Deadline 23 Jan 2019.

Research Internship at IBM

Once the start date and funding for my PhD were confirmed, I started searching the internet for what makes a successful PhD student in Computer Science. ...


British Science Festival

David Benoit from Chemistry and Nina Dethlefs gave a talk at the British Science Festival on "Let the AI In".


The Conversation

Following up on our talk at the British Science Festival, we published an article in The Conversation --- AI: there’s a reason it’s so bad at conversation.

EPSRC First Grant

Alex Turner got his EPSRC grant on 'Using Epigenetically-Inspired Connectionist Models to Provide Transparency In The Modelling of Human Visceral Leishmaniasis'.

See details.

Welcome and big data cake

Today we had "big data" cake to welcome 3 new team members who will be working on deep learning and computational biology.


KDD-2018 in London

After having my paper accepted for the 2018 WISDOM workshop at KDD, I felt both nervous and excited. ...


Get in touch


+44 (0) 1482 465994


Email one of our team


Robert Blackburn Building
3rd floor
University of Hull
Cottingham Road
HU6 7RX, Hull, UK