Grapheme-to-Phoneme (G2P) correspondences form foundational frameworks of tasks such as text-to-speech (TTS) synthesis or automatic speech recognition. The G2P process involves taking words in their written form and generating their pronunciation. In this paper, we critique the status quo definition of a grapheme, currently a forced alignment process relating a single character to either a phoneme or a blank unit, that underlies the majority of modern approaches. We develop a linguistically-motivated redefinition from simple concepts such as vowel and consonant count and word length and offer a proof-of-concept implementation based on a multi-binary neural classification task. Our model achieves competitive results with a 31.86% Word Error Rate on a standard benchmark, while generating linguistically meaningful grapheme segmentations.
Proceedings of the 28th Conference on Computational Natural Language Learning (CoNLL), Miami, USA.
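For readers unfamiliar with the metric quoted in the abstract above, the following is a minimal sketch of how a Word Error Rate is commonly computed for G2P: the percentage of words whose predicted phoneme sequence differs from the reference in any position. The function name and the toy pronunciations are illustrative assumptions and are not taken from the paper or its benchmark.

```python
def word_error_rate(predicted, reference):
    """Percentage of words whose predicted phoneme sequence is not an exact match."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must contain the same number of words")
    if not reference:
        raise ValueError("at least one word is required")
    # A word counts as an error if any phoneme differs or the lengths differ.
    errors = sum(1 for p, r in zip(predicted, reference) if list(p) != list(r))
    return 100.0 * errors / len(reference)


# Toy data (hypothetical ARPAbet-style symbols, for illustration only).
reference = [["K", "AE", "T"], ["D", "AO", "G"], ["F", "IH", "SH"], ["B", "ER", "D"]]
predicted = [["K", "AE", "T"], ["D", "AA", "G"], ["F", "IH", "SH"], ["B", "ER", "D"]]

wer = word_error_rate(predicted, reference)
print(f"WER: {wer:.2f}%")
```

Because a single wrong phoneme makes the whole word count as an error, this measure is stricter than a per-phoneme error rate.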
In the realm of social media discourse, the integration of slang enriches communication, reflecting the sociocultural identities of users. This study investigates the capability of large language models (LLMs) to paraphrase slang within climate-related tweets from Nigeria and the UK, with a focus on identifying emotional nuances. Using DistilRoBERTa as the baseline model, we observe its limited comprehension of slang. To improve cross-cultural understanding, we gauge the effectiveness of leading LLMs: ChatGPT 4, Gemini, and LLaMA3 in slang paraphrasing. While ChatGPT 4 and Gemini demonstrate comparable effectiveness in slang paraphrasing, LLaMA3 shows less coverage, with all LLMs exhibiting limitations in coverage, especially of Nigerian slang. Our findings underscore the necessity for culturally-sensitive LLM development in emotion classification, particularly in non-anglocentric regions.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), Miami, USA.
The Offshore Wind (OSW) industry is experiencing significant expansion, resulting in increased Operations & Maintenance (O&M) costs. Intelligent alarm systems offer the prospect of swift detection of component failures and process anomalies, enabling timely and precise interventions that could yield reductions in resource expenditure, as well as scheduled and unscheduled downtime. This paper introduces an innovative approach to tackle this challenge by capitalising on Large Language Models (LLMs). We present a specialised conversational agent that incorporates statistical techniques to calculate distances between sentences for the detection and filtering of hallucinations and unsafe output. This potentially enables improved interpretation of alarm sequences and the generation of safer repair action recommendations by the agent. Preliminary findings are presented with the approach applied to ChatGPT-4 generated test sentences. The limitation of using ChatGPT-4 and the potential for enhancement of this agent through re-training with specialised OSW datasets are discussed.
Manufacturers have increasingly turned to Artificial Intelligence (AI) to address specific problems in factories, e.g. predictive maintenance and improving product quality. Implementing these solutions in silos can miss critical interdependencies during root cause analysis and the interplay between various data sources across broader manufacturing operations. We provide a perspective on a holistic data model that harmonises data generated during production processes and across the supply chain, enabling data-driven decisions to drive productivity, optimisation, and sustainability. Our approach is based on Overall Equipment Effectiveness (OEE): it focuses on predictive maintenance to increase availability, Continued Process Verification (CPV) and predictive stability to improve quality, and real-time insights to optimise performance. The proposed multimodal model can help enhance productivity and reduce waste and rework, lending itself to sustainability imperatives. Similarly, Cost to Serve analysis, which targets inefficiencies in the distribution network and transportation, helps facilitate a reduction in costs and minimisation of carbon footprint in the supply chain.
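As background for the OEE framing in the abstract above, OEE is conventionally defined as the product of availability, performance and quality, each expressed as a fraction between 0 and 1. The sketch below shows that arithmetic only; the function name and the example figures are hypothetical and do not come from the paper.

```python
def oee(availability, performance, quality):
    """Overall Equipment Effectiveness as the product of its three factors."""
    for name, value in (("availability", availability),
                        ("performance", performance),
                        ("quality", quality)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a fraction between 0 and 1")
    return availability * performance * quality


# Hypothetical line: 90% uptime, 95% of ideal speed, 98% good units.
score = oee(availability=0.90, performance=0.95, quality=0.98)
print(f"OEE: {score:.2%}")
```

Because the factors multiply, a modest loss in each one compounds, which is one reason to consider the three together and not in isolation.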
As machine learning is increasingly making decisions about hiring or healthcare, we want AI to treat ethnic and socioeconomic groups fairly. Fairness is currently measured by comparing the average accuracy of reasoning across groups. We argue that improved measurement is possible on a continuum and without averaging, with the advantage that nuances could be observed within groups. Through the example of skin cancer diagnosis, we illustrate a new statistical method that works on multidimensional data and treats fairness in a continuum. We outline this new approach and focus on its robustness against three types of adversarial attacks. Indeed, such attacks can influence data in ways that may cause different levels of misdiagnosis for different skin tones, thereby distorting fairness. Our results reveal nuances that would not be evident in a strictly categorical approach.
Digital Twin (DT) technology has seen an explosion in popularity, with wind energy no exception. This is particularly true for Operations & Maintenance (O&M) applications. However, this expanded use has been accompanied by loose, conflicting definitions that threaten to reduce the term to a buzzword and prevent the technology from meeting its full potential. A number of attempts have been made to better define and classify DTs; however, these either oversimplify the term or tighten criteria, leading to the exclusion of many DT applications. A new definition framework dubbed the Digital Twin Family Tree is therefore proposed. This widens "Digital Twin" to a general umbrella term for the technology, accompanied by specific definitions. DT Tags are also used to provide individualised characteristics for implementations. A sector-specific definition, dubbed a CS-DT, was devised for component and system monitoring and predictions in wind energy O&M, and suitable DT Tags were created. The proposed framework was used to review existing research in literature, demonstrating the potential for increased understanding, explainability, and accessibility of DTs for expert and non-expert stakeholders.
This paper outlines our multimodal ensemble learning system for identifying persuasion techniques in memes. We contribute an approach which utilises the novel inclusion of consistent named visual entities extracted using the Google Vision API as an external knowledge source, joined to our multimodal ensemble via late fusion. As well as detailing our experiments in ensemble combinations, fusion methods and data augmentation, we explore the impact of including external data and summarise post-evaluation improvements to our architecture based on analysis of the task results.
2024 SEMEVAL 2024 Shared Task on "Multilingual Detection of Persuasion Techniques in Memes", at the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), Mexico City, Mexico
When training models for visual anomaly detection, typically, a dataset is collected and then annotated offline. Even if collecting raw data is relatively cheap, annotations are expensive, especially if they require human expertise. We therefore propose a novel interactive learning framework that combines active learning with natural language interaction to minimise the amount of annotated training data and allow for refined human expert feedback that may be leveraged in the learning process. In our initial experiments on wind turbine drone images, we demonstrate the effectiveness of active learning for anomaly detection when using ground truth labels, and assess the impact on learning when collecting labels from ‘experts’ versus ‘non-experts’ using our dialogue system. In addition to anomaly labels with confidence scores, we collect and analyse natural language explanations, which may be used to improve both anomaly detection performance and explainability.
2024 The 14th International Workshop on Spoken Dialogue Systems Technology, Sapporo, Japan
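The active learning idea in the abstract above can be illustrated with uncertainty sampling, one common query strategy in which the images the current model is least sure about are sent to the annotator first. The abstract does not say which strategy the authors use, so the strategy, the function name and the scores below are all assumptions made for illustration.

```python
def select_queries(probabilities, budget):
    """Return the `budget` item ids whose anomaly probability is closest to 0.5."""
    if budget < 0:
        raise ValueError("budget must be non-negative")
    # Distance from 0.5 is a simple uncertainty measure for a binary classifier.
    ranked = sorted(probabilities, key=lambda item: abs(probabilities[item] - 0.5))
    return ranked[:budget]


# Hypothetical anomaly probabilities from a model trained on the labels so far.
scores = {"img_a": 0.97, "img_b": 0.52, "img_c": 0.08, "img_d": 0.41, "img_e": 0.70}

queries = select_queries(scores, budget=2)
print(queries)
```

In an interactive setting such as the one described, the selected images would then be shown to the human annotator, and the returned labels added to the training set before the next round.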
The zeitgeist of the digital era has been dominated by an expanding integration of Artificial Intelligence (AI) in a plethora of applications across various domains. With this expansion, however, questions of the safety and reliability of these methods have become more relevant than ever. SafeML is a model-agnostic approach for performing such monitoring, using distance measures based on statistical testing of the training and operational datasets, comparing them to a predetermined threshold, and returning a binary value indicating whether the model should be trusted in the context of the observed data or be deemed unreliable. Although a systematic framework exists for this approach, its performance is hindered by: (1) a dependency on a number of design parameters that directly affect the selection of a safety threshold and therefore likely affect its robustness, (2) an inherent assumption of certain distributions for the training and operational sets, as well as (3) a high computational complexity for relatively large sets. This work addresses these limitations by changing the binary decision to a continuous metric. Furthermore, all data distribution assumptions are made obsolete by implementing non-parametric approaches, and the computational speed is increased by introducing a new distance measure based on the Empirical Characteristic Functions (ECF).
2024 Future of Information and Communication Conference, Berlin, Germany
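To give a feel for what an ECF-based distance between a training set and an operational set might look like, here is a minimal one-dimensional sketch. It evaluates each sample's empirical characteristic function on a grid of frequencies and averages the squared differences. The grid, the unweighted average and the function names are assumptions chosen for illustration and should not be read as the measure defined in the paper.

```python
import cmath


def ecf(sample, t):
    """Empirical characteristic function of a 1-D sample at frequency t."""
    return sum(cmath.exp(1j * t * x) for x in sample) / len(sample)


def ecf_distance(a, b, t_grid):
    """Mean squared difference between two ECFs over a frequency grid."""
    return sum(abs(ecf(a, t) - ecf(b, t)) ** 2 for t in t_grid) / len(t_grid)


# Hypothetical data: a training sample and an operational sample shifted by 2.0.
t_grid = [k / 10 for k in range(-30, 31)]
train = [0.1, 0.4, 0.35, 0.8, 0.55]
shifted = [x + 2.0 for x in train]

same = ecf_distance(train, train, t_grid)
drift = ecf_distance(train, shifted, t_grid)
print(same, drift)
```

The result is a continuous, non-parametric score: it is zero for identical samples and grows as the two distributions diverge, with no assumption about the shape of either distribution.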
Traditional predictive simulations and remote sensing techniques for forecasting floods are based on fixed and spatially restricted physics-based models. These models are computationally expensive and can take many hours to run, resulting in predictions made based on outdated data. They are also spatially fixed, and unable to scale to unknown areas. By modelling the task as an image segmentation problem, an alternative approach using artificial intelligence to approximate the parameters of a physics-based model in 2D is demonstrated, enabling rapid predictions to be made in real-time.
2024 Northern Lights Deep Learning Conference, Tromso, Norway
To understand the global trends of human opinion on climate change in specific geographical areas, this research proposes a framework to analyse linguistic features and cultural differences in climate-related tweets. Our study combines transformer networks with linguistic feature analysis to address small dataset limitations and gain insights into cultural differences in tweets from the UK and Nigeria. Our study found that Nigerians use more leadership language and informal words in discussing climate change on Twitter compared to the UK, as these topics are treated as an issue of salience and urgency. In contrast, the UK’s discourse about climate change on Twitter is characterised by using more formal, logical, and longer words per sentence compared to Nigeria. We also confirm the geographical identifiability of tweets through a classification task using DistilBERT, which achieves 83% accuracy.
2023 Proceedings of the CLASP Conference on Learning with Small Data (LSD), Gothenburg, Sweden
Wind power is a key pillar in efforts to decarbonise energy production. However, variability in wind speed and resultant wind turbine power generation poses a challenge for power grid integration. Digital Twin (DT) technology provides intelligent service systems, combining real-time monitoring, predictive capabilities and communication technologies. Current DT research for wind turbine power generation has focused on providing wind speed and power generation predictions reliant on Supervisory Control and Data Acquisition (SCADA) sensors, with predictions often limited to the timeframe of datasets. This research looks to expand on this, utilising a novel framework for an intelligent DT system powered by k-Nearest Neighbour (kNN) regression models to upscale live wind speed forecasts to higher wind turbine hub-height and then forecast power generation. As there is no live link to a wind turbine, the framework is referred to as a “Simulated Digital Twin” (SimTwin). 2019-2020 SCADA and wind speed data are used to evaluate this, demonstrating that the method provides suitable predictions. Furthermore, full deployment of the SimTwin framework is demonstrated using live wind speed forecasts. This may prove useful for operators by reducing reliance on SCADA systems and provides a research and development tool where live data is limited.
2023 The 6th International Conference on Renewable Energy and Environment Engineering (REEE 2023)
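The kNN regression step described in the abstract above can be sketched as follows: given historical pairs of low-level and hub-height wind speeds, a new low-level forecast is upscaled by averaging the hub-height values of its k nearest historical neighbours. The single input feature, the value of k and the numbers below are hypothetical, and the paper's actual features and settings may differ.

```python
def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k training points closest to `query` (1-D input)."""
    if len(train_x) != len(train_y):
        raise ValueError("train_x and train_y must have the same length")
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training points")
    # Sort the training pairs by absolute distance to the query and keep the k closest.
    nearest = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - query))[:k]
    return sum(y for _, y in nearest) / k


# Hypothetical historical data: low-level forecast speed vs hub-height speed (m/s).
low_level = [3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
hub_height = [4.1, 5.4, 6.8, 8.0, 9.5, 10.7]

estimate = knn_predict(low_level, hub_height, query=5.2, k=3)
print(f"Estimated hub-height wind speed: {estimate:.2f} m/s")
```

A second regressor of the same form could then map the estimated hub-height speed to power output, which mirrors the two-stage use of kNN models that the abstract describes.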
Traditional approaches to flood modelling mostly rely on hydrodynamic physical simulations. While these simulations can be accurate, they are computationally expensive and prohibitively so when thinking about real-time prediction based on dynamic environmental conditions.
Alternatively, social media platforms such as Twitter are often used by people to communicate during a flooding event, but discovering which tweets hold useful information is the key challenge in extracting information from posts in real time.
In this article, we present a novel model for flood forecasting and monitoring that makes use of a transformer network that assesses the severity of a flooding situation based on sentiment analysis of the multimodal inputs (text and images). We also present an experimental comparison of a range of state-of-the-art deep learning methods for image processing and natural language processing. Finally, we demonstrate that information induced from tweets can be used effectively to visualise fine-grained geographical flood-related information dynamically and in real-time.
Wind energy’s ability to liberate the world from conventional sources of energy relies on lowering the significant costs associated with the maintenance of wind turbines. Since icing events on turbine rotor blades are a leading cause of operational failures, identifying icing in advance is critical. Some recent studies have utilized deep learning (DL) techniques to predict icing events with high accuracy by leveraging rotor blade images, but these studies only focus on specific wind parks and fail to generalize to unseen scenarios (e.g., new rotor blade designs). In this paper, we aim to facilitate ice prediction in the face of a lack of ice images in new wind parks. We propose the utilization of synthetic data augmentation via a generative artificial intelligence technique, the neural style transfer algorithm, to improve the generalization of existing ice prediction models. We also compare the proposed technique with the CycleGAN as a baseline. We show that training standalone DL models with augmented data that captures domain-invariant icing characteristics can help improve predictive performance across multiple wind parks. Through efficient identification of icing, this study can support preventive maintenance of wind energy sources by making them more reliable toward tackling climate change.
2023 Environmental Data Science, Cambridge University Press
This paper proposes a multi-channel convolutional neural network (MC-CNN) for classifying memes and non-memes. Our architecture is trained and validated on a challenging dataset that includes non-meme formats with textual attributes, which are also circulated online but rarely accounted for in meme classification tasks. Alongside a transfer learning base, two additional channels capture low-level and fundamental features of memes that make them unique from other images with text. We contribute an approach which outperforms previous meme classifiers specifically in live data evaluation, and one that is better able to generalise ‘in the wild’. Our research aims to improve accurate collation of meme content to support continued research in meme content analysis, and meme-related sub-tasks such as harmful content detection.
2023 Proceedings of the ACM International Conference on Multimedia Retrieval (ICMR). Thessaloniki, Greece.
We explore the recently released ChatGPT model, one of the most powerful conversational AI models that has ever been developed. This opinion provides a perspective on its strengths and weaknesses and a call to action for the AI community (including academic researchers and industry) to work together on preventing potential misuse of such powerful AI models in our everyday lives.
Wind energy’s ability to liberate the world from conventional sources of energy relies on lowering the significant costs associated with the maintenance of wind turbines. Since icing events on turbine rotor blades are a leading cause of operational failures, identifying icing in advance is critical. Some recent studies focus on specific wind parks and fail to generalize to unseen scenarios (e.g. new rotor blade designs). We propose the utilisation of synthetic data augmentation via neural style transfer to improve the generalization of existing ice prediction models. We show that training models with augmented data that captures domain-invariant icing characteristics can help improve predictive performance across multiple wind parks. Through efficient identification of icing, this study can support preventive maintenance of wind energy sources by making them more reliable towards tackling climate change.
2022 Climate Change AI Workshop, NeurIPS, New Orleans, USA
With an increasing emphasis on driving down the costs of Operations and Maintenance (O&M) in the Offshore Wind (OSW) sector comes the requirement to explore new methodologies and applications of Deep Learning (DL) to the domain. Condition-based monitoring (CBM) has been at the forefront of recent research developing alarm-based systems and data-driven decision making. This paper provides a brief insight into the research being conducted in this area, with a specific focus on alarm sequence modelling and the associated challenges faced in its implementation. The paper proposes a novel idea to predict a set of relevant repair actions from an input sequence of alarm sequences, comparing Long Short-term Memory (LSTM) and Bidirectional LSTM (biLSTM) models. Achieving training accuracy of up to 80.23% and test accuracy of up to 76.01% with biLSTM gives a strong indication of the potential benefits of the proposed approach, which can be explored further in future research. The paper introduces a framework that integrates the proposed approach into O&M procedures and discusses the potential benefits, which include the reduction of a confusing plethora of alarms, as well as unnecessary vessel transfers to the turbines for fault diagnosis and correction.
2022 8th International Symposium on Model-Based Safety Assessment, Munich, Germany
Intelligent question-answering (QA) systems have witnessed increased interest in recent years, particularly in their ability to facilitate information access, data interpretation or decision support. The wind energy sector is one of the most promising sources of renewable energy, yet turbines regularly suffer from failures and operational inconsistencies, leading to downtimes and significant maintenance costs. Addressing these issues requires rapid interpretation of complex and dynamic data patterns under time-critical conditions. In this article, we present a novel approach that leverages interactive, natural language-based decision support for operations & maintenance (O&M) of wind turbines. The proposed interactive QA system allows engineers to pose domain-specific questions in natural language, and provides answers (in natural language) based on the automated retrieval of information on turbine sub-components, their properties and interactions, from a bespoke domain-specific knowledge graph (KG). As data for specific faults is often sparse, we propose the use of paraphrase generation as a way to augment the existing dataset. Our QA system leverages encoder-decoder models to generate Cypher queries to obtain domain-specific facts from the KG database in response to user-posed natural language questions. Experiments with an attention-based sequence-to-sequence (Seq2Seq) model and a transformer show that the transformer accurately predicts up to 89.75% of responses to input questions, outperforming the Seq2Seq model marginally by 0.76%, while also being 9.46 times more computationally efficient. The proposed QA system can help support engineers and technicians during O&M to reduce turbine downtime and operational costs, thus improving the reliability of wind energy as a source of renewable energy.
A rise in anomalous ecological events will be observed due to climate change. One such event is the harmful algal bloom, which occurs due to an increase in nutrients from anthropogenic activities and has economic and ecological effects. Algae thrive in warmer temperatures, which will lead to an increase in the frequency of harmful algal blooms. To cope with this increasing frequency, early detection tools are essential. Deep learning and frequent monitoring have been used to detect this phenomenon with a focus on unimodal approaches. In this work, we propose using multiple sources of satellite and in-situ data for detecting algal blooms with a joint multimodal learning approach, focusing on the North Sea and the Irish Sea. This work will help domain experts to monitor potential changes to the ecosystem caused by human interference and to take action when necessary.
2022 ECML/PKDD Workshop on Machine Learning for Earth Observation
Artificial intelligence (AI) can help facilitate wider adoption of renewable energy globally. We organized a social event for the AI and renewables community to discuss these aspects at the International Conference on Learning Representations (ICLR), a leading AI conference. This opinion reflects on the key messages and provides a call for action on leveraging AI for transition toward net zero.
Several existing resources are available for sentiment analysis (SA) tasks that are used for learning sentiment-specific embedding (SSE) representations. These resources are either large, common-sense knowledge graphs (KG) that cover a limited amount of polarities/emotions or they are smaller in size, such as lexicons, which require costly human annotation and cover fine-grained emotions. Therefore, using knowledge resources to learn SSE representations is either limited by the low coverage of polarities/emotions or the overall size of a resource. In this paper, we first introduce a new directed KG called ‘RELATE’, which is built to overcome both the issue of low coverage of emotions and the issue of scalability. RELATE is the first KG of its size to cover Ekman’s six basic emotions that are directed towards entities. It is based on linguistic rules to incorporate the benefit of semantics without relying on costly human annotation. The performance of ‘RELATE’ is evaluated by learning SSE representations using a Graph Convolutional Neural Network (GCN).
2022 13th Language Resources and Evaluation Conference (LREC).
As online communication grows, memes have continued to evolve and circulate as succinct multimodal forms of communication. However, computational approaches applied to meme-related tasks lack the same depth and contextual sensitivity of non-computational approaches and struggle to interpret intra-modal dynamics and referentiality. This research proposes a ‘meme genealogy’ of key features and relationships between memes to inform a knowledge base constructed from meme-specific online sources and embed connotative meaning and contextual information in memes. The proposed methods provide a basis to train contextually sensitive computational models for analysing memes and applications in automated meme annotation.
Possible sensor failures in monitoring systems result in partially filled data, which may lead to erroneous statistical conclusions that affect critical systems such as pollutant detectors and anomaly activity detectors. Therefore, imputation becomes necessary to decrease error. This work addresses the missing data problem by experimenting with various methods in the context of a water quality dataset with high miss rates. The compared models, which make different assumptions about the data, are Generative Adversarial Networks, Multiple Imputation by Chained Equations, Variational Auto-Encoders, and Recurrent Neural Networks. A novel recurrent neural network architecture with self-attention is proposed in which imputation is done in a single pass. The proposed model performs with a lower root mean square error, ranging from 0.012 to 0.28, in three of the four locations. The self-attention components increase the interpretability of the imputation process at each stage of the network, providing information to domain experts.
2022 IEEE International Joint Conference on Neural Networks (IJCNN). Padua, Italy.
Offshore wind turbine monopiles require structural health monitoring throughout their lifespan, yet direct structural measurements are limited. This paper combines numerical modeling and machine learning to present an approach to obtain rapid estimations of monopile fatigue using hourly metocean conditions. Aero-hydro-servo-elastic numerical simulations for a reference turbine provide the meta-model training dataset that encompasses wind-wave conditions applicable to the North Sea. Analysis reveals conditions whereby higher-order fully non-linear wave kinematics produce larger damage values compared to linear waves. This increase in damage is absent when implementing a simple probabilistic data lumping method. The prototype meta-model is developed based on convolutional neural networks to determine the monopile damage from measured wind-wave conditions at high temporal frequency. The proof-of-concept meta-model provides a step-change that demonstrates a promising approach to estimate monopile fatigue accumulation at high temporal resolution with scope for development to specific real-world offshore wind farms where validation data is available.
2022 European Workshop on Structural Health Monitoring (EWSHM), Palermo, Italy.
Climate change will affect how water sources are managed and monitored. Continuous monitoring of water quality is crucial to detect pollution, to ensure that various natural cycles are not disrupted by anthropogenic activities and to assess the effectiveness of beneficial management measures taken under defined protocols. One such disruption is algal blooms, in which populations of phytoplankton increase rapidly, affecting biodiversity in marine environments. The frequency of algal blooms will increase with climate change as it presents favourable conditions for reproduction of phytoplankton. Machine learning has been used for early detection of algal blooms previously, with the focus mostly on single closed bodies of water in Far East Asia with short time ranges. In this work, we study four locations around the North Sea and the Irish Sea with different characteristics, predicting activity with longer time-spans and explaining the importance of the input for the decision making process with regards to the prediction model. This work helps domain experts to monitor potential changes to the ecosystem caused by human interference over longer time ranges and to take action when necessary.
2022 Northern Lights Deep Learning Conference (NLDL).
Classical measurements and modelling that underpin present flood warning and alert systems are based on fixed and spatially restricted static sensor networks. Computationally expensive physics-based simulations are often used that can’t react in real-time to changes in environmental conditions. We want to explore contemporary artificial intelligence (AI) for predicting flood risk in real time by using a diverse range of data sources. By combining heterogeneous data sources, we aim to nowcast rapidly changing flood conditions and gain a greater understanding of urgent humanitarian needs.
As urban environments manifest high levels of complexity, it is of vital importance that safety systems embedded within autonomous vehicles (AVs) are able to accurately anticipate short-term future motion of nearby agents. This problem can be further understood as generating a sequence of coordinates describing the future motion of the tracked agent. Various proposed approaches demonstrate significant benefits of using a rasterised top-down image of the road, with a combination of Convolutional Neural Networks (CNNs), for extraction of relevant features that define the road structure (e.g. driveable areas, lanes, walkways). In contrast, this paper explores use of Capsule Networks (CapsNets) in the context of learning a hierarchical representation of sparse semantic layers corresponding to small regions of the High-Definition (HD) map. Each region of the map is dismantled into separate geometrical layers that are extracted with respect to the agent's current position. By using an architecture based on CapsNets the model is able to retain hierarchical relationships between detected features within images whilst also preventing loss of spatial data often caused by the pooling operation. We train and evaluate our model on the publicly available nuTonomy scenes dataset and compare it to recently published methods. We show that our model achieves significant improvement over recently published works on deterministic prediction, whilst drastically reducing the overall size of the network.
2021 International Conference on Robotics and Automation (ICRA).
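To make the capsule idea in the abstract above more concrete, the following is a minimal sketch of the "squash" nonlinearity commonly used in capsule networks: it keeps a capsule vector's direction and compresses its length into the range [0, 1), so that length can be read as the presence of a feature. This is an illustration only; it is not taken from the paper, and the function names are hypothetical.

```python
import math


def squash(vector):
    """Keep the vector's direction but compress its length into [0, 1)."""
    squared_norm = sum(component * component for component in vector)
    if squared_norm == 0.0:
        # A zero vector has no direction; return it unchanged.
        return [0.0 for _ in vector]
    norm = math.sqrt(squared_norm)
    # Length becomes squared_norm / (1 + squared_norm); direction is vector / norm.
    scale = squared_norm / (1.0 + squared_norm) / norm
    return [scale * component for component in vector]


def length(vector):
    """Euclidean length of a vector."""
    return math.sqrt(sum(component * component for component in vector))


weak = squash([0.1, 0.0])     # short input stays close to zero length
strong = squash([3.0, 4.0])   # long input approaches, but never reaches, length 1
print(length(weak), length(strong))
```

Because the output is still a vector, orientation information survives the nonlinearity, which is one reason capsule layers are often described as an alternative to pooling.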
Current approaches that generate text from linked data for complex real-world domains can face problems including rich and sparse vocabularies as well as learning from examples of long varied sequences. In this article, we propose a novel divide-and-conquer approach that automatically induces a hierarchy of “generation spaces” from a dataset of semantic concepts and texts. Generation spaces are based on a notion of similarity of partial knowledge graphs that represent the domain and feed into a hierarchy of sequence-to-sequence or memory-to-sequence learners for concept-to-text generation. An advantage of our approach is that learning models are exposed to the most relevant examples during training which can avoid bias towards majority samples. We evaluate our approach on two common benchmark datasets and compare our hierarchical approach against a flat learning setup. We also conduct a comparison between sequence-to-sequence and memory-to-sequence learning models. Experiments show that our hierarchical approach overcomes issues of data sparsity and learns robust lexico-syntactic patterns, consistently outperforming flat baselines and previous work by up to 30%. We also find that while memory-to-sequence models can outperform sequence-to-sequence models in some cases, the latter are generally more stable in their performance and represent a safer overall choice.
Recent statistics in suicide prevention show that people are increasingly posting their last words online and with the
unprecedented availability of textual data from social media platforms researchers have the opportunity to analyse such data.
Furthermore, psychological studies have shown that our state of mind can manifest itself in the linguistic features we use to
communicate. In this paper, we investigate whether it is possible to automatically identify suicide notes from other types of social media
blogs in two document-level classification tasks. The first task aims to identify suicide notes from depressed posts and blog posts in a
balanced dataset, whilst the second experiment looks at how well suicide notes can be classified when there is a vast amount of
neutral text data, which makes the task more applicable to real-world scenarios. Furthermore, we perform a linguistic analysis using
LIWC (Linguistic Inquiry and Word Count). We present a learning model for modelling long sequences in two experiment series. We
achieve an f1-score of 88.26% over the baselines of 0.60 in experiment 1 and 96.1% over the baseline in experiment 2. Finally, we
show through visualisations which features the learning model identifies; these include emotions such as love and personal pronouns.
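For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an F1-score is computed from classification counts. The counts in the usage line are invented purely for illustration and are not results from the paper.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall, computed from raw counts."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    if predicted_positive == 0 or actual_positive == 0:
        return 0.0
    precision = true_positives / predicted_positive
    recall = true_positives / actual_positive
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Illustrative counts only (assumed, not from the paper).
example_score = f1_score(true_positives=8, false_positives=2, false_negatives=2)
print(example_score)
```

Since F1 ignores true negatives, it is less easily inflated than plain accuracy when one class, such as neutral text, vastly outnumbers the class of interest.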
Recently, many Deep Learning models have been employed to classify different kinds of plant diseases, but very little work has been done on disease severity detection. However, it is more important to assess the severity of plant diseases accurately and in a timely manner, as this helps to make effective decisions to protect the plants from further infection and to reduce financial loss. In this paper, based on the Huanglongbing (HLB)-infected leaf images obtained from PlantVillage and crowdAI, we created a dataset with 5,406 citrus leaf images infected by HLB. Six popular models were then trained to detect the severity of citrus HLB, with the goal of finding which types of models are more suitable for detecting HLB severity under the same training conditions. The experimental results show that the Inception_v3 model with epochs=60 achieves higher accuracy than the other models for severity detection, with an accuracy of 74.38%, due to its high computational efficiency and small number of parameters. Additionally, to evaluate whether GAN-based data augmentation can improve model learning performance, we adopted DCGANs (Deep Convolutional Generative Adversarial Networks) to augment the original training dataset to up to twice its size. Finally, a new training dataset with 14,056 leaf images, composed of the original training images and the augmented ones, was used to train the Inception_v3 model. As a result, we achieved an accuracy of 92.60%, about 20% higher than that of the Inception_v3 model trained on the original training dataset, which suggests that GAN-based data augmentation is very useful for improving model learning performance.
Artificial gene regulatory networks (AGRNs) are connectionist architectures inspired by biological gene regulation capable of solving tasks within complex dynamical systems. The implementation of an operational
layer inspired by epigenetic mechanisms has been shown to improve the performance of AGRNs, and improve
their transparency by providing a degree of explainability. In this paper, we apply artificial epigenetic layers
(AELs) to two trained deep neural networks (DNNs) in order to gain an understanding of their internal workings, by determining which parts of the network are required at a particular point in time, and which nodes
are not used at all. The AEL consists of artificial epigenetic molecules (AEMs) that dynamically interact with
nodes within the DNNs to allow for the selective deactivation of parts of the network.
2020. Proceedings of the 12th International Joint Conference on Computational Intelligence (IJCCI).
Mild cognitive impairment (MCI) has been described as the intermediary stage before Alzheimer's Disease (AD); many people, however, remain stable or even demonstrate improvement in cognition. Early detection of progressive MCI (pMCI) can therefore be utilised in identifying at-risk individuals and directing additional medical treatment in order to revert conversion to AD as well as provide psychosocial support for the person and their family. This paper presents a novel solution for the early detection of pMCI people and the classification of AD risk within MCI people. We propose a model, MudNet, that utilises deep learning for the simultaneous prediction of progressive/stable MCI classes and time-to-AD conversion, where high-risk pMCI people see conversion to AD within 24 months and low-risk people after more than 24 months. MudNet is trained and validated using baseline clinical and volumetric MRI data (n = 559 scans) from participants of the Alzheimer's Disease Neuroimaging Initiative (ADNI). The model utilises T1-weighted structural MRIs alongside clinical data, which also contain neuropsychological tests (RAVLT, ADAS-11, ADAS-13, ADASQ4, MMSE), as inputs. The averaged results of our model indicate a binary accuracy of 69.8% for conversion predictions and a categorical accuracy of 66.9% for risk classifications.
2020 IEEE/ACM International Conference on Big Data Computing, Applications and Technologies (BDCAT) (pp. 9-18). IEEE.
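The risk categories described in the abstract above can be sketched as a simple labelling rule. This is an illustrative reading of the 24-month threshold only: the function name, the use of `None` for participants with no observed conversion, and the treatment of exactly 24 months as "within 24 months" are assumptions, not details confirmed by the paper.

```python
def risk_label(months_to_conversion):
    """Map time-to-AD conversion (in months) to a coarse risk category.

    Assumption: None means no conversion to AD was observed (stable MCI).
    Assumption: a conversion at exactly 24 months counts as "within 24 months".
    """
    if months_to_conversion is None:
        return "stable"
    if months_to_conversion <= 24:
        return "high-risk"
    return "low-risk"


# Hypothetical participants, for illustration only.
labels = [risk_label(m) for m in (6, 24, 36, None)]
print(labels)
```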
Offshore wind farm operators need to make short-term decisions on planning vessel transfers to turbines for preventive or corrective maintenance. These decisions can play a pivotal role in ensuring maintenance actions are carried out in a timely and cost-effective manner. The present optimization of offshore vessel transfer uses mathematical models rather than learning decisions from historical data. In this paper, we design a simulated environment for an offshore wind farm based on Supervisory Control & Data Acquisition (SCADA) data and alarm logs of historical faults in an operational turbine. Firstly, we utilise a state-of-the-art decision tree model to predict fault types using SCADA features, and provide explainable decisions. Next, we apply deep reinforcement learning to automatically learn maintenance priorities corresponding to different fault types for ensuring prioritized vessel transfers for critical conditions, and deciding on optimal vessel fleet size. This can lead to significant savings in maintenance costs for the offshore wind industry.
Developments in Renewable Energies Offshore: Proceedings of the 4th International Conference on Renewable Energies Offshore (RENEW 2020, 12-15 October 2020, Lisbon, Portugal).
Machine learning techniques have been widely used for condition-based monitoring
of wind turbines using Supervisory Control & Data Acquisition (SCADA) data. However, many
machine learning models, including neural networks, operate as black boxes: despite performing
suitably well as predictive models, they are not able to identify causal associations within
the data. For data-driven systems to approach human-level intelligence in generating effective
maintenance strategies, it is integral to discover hidden knowledge in the operational data. In
this paper, we apply deep learning to discover causal relationships between multiple features
(confounders) in SCADA data for faults in various sub-components from an operational turbine
using convolutional neural networks (CNNs) with attention. Our technique overcomes the black
box nature of conventional deep learners and identifies hidden confounders in the data through
the use of temporal causal graphs. We demonstrate the effects of SCADA features on a wind
turbine’s operational status, and show that our technique contributes to explainable AI for wind
energy applications by providing transparent and interpretable decision support.
The last decade has witnessed an increased interest in applying machine learning techniques to predict faults and anomalies in the operation of wind turbines. These efforts have lately been dominated by deep learning techniques which, as in other fields, tend to outperform traditional machine learning algorithms given sufficient amounts of training data. An important shortcoming of deep learning models is their lack of transparency – they operate as black boxes and typically do not provide rationales for their predictions, which can lead to a lack of trust in predicted outputs. In this article, a novel hybrid model for anomaly prediction in wind farms is proposed, that combines a recurrent neural network approach for accurate classification with an XGBoost decision tree classifier for transparent outputs. Experiments with an offshore wind turbine show that our model achieves a classification accuracy of up to 97%. The model is further able to generate detailed feature importance analyses for any detected anomalies, identifying exactly those components in a wind turbine that contribute to an anomaly. Finally, the feasibility of transfer learning is demonstrated for the wind domain by porting our “offshore” model to an unseen dataset from an onshore wind farm. The latter model achieves an accuracy of 65% and is able to detect 85% of anomalies in the unseen domain. These results are encouraging for application to wind farms for which no training data is available, e.g. because they have not been in operation for long.
Wind energy is one of the fastest-growing sustainable energy sources in the world but relies crucially on efficient and effective operations and maintenance to generate sufficient amounts of energy and reduce downtime of wind turbines and associated costs. Machine learning has been applied to fault prediction in wind turbines, but these predictions have not been supported with suggestions on how to avert and fix faults. We present a data-to-text generation system utilising transformers for generating corrective maintenance strategies for faults using SCADA data capturing the operational status of turbines. We achieve this in two stages: a first stage identifies faults based on SCADA input features and their relevance. A second stage performs content selection for the language generation task and creates maintenance strategies based on phrase-based natural language templates. Experiments show that our dual transformer model achieves an accuracy of up to 96.75% for alarm prediction and up to 75.35% for its choice of maintenance strategies during content-selection. A qualitative analysis shows that our generated maintenance strategies are promising. We make our human-authored maintenance templates publicly available, and include a brief video explaining our approach.
2020 International Joint Conference on Neural Networks (IJCNN).
The global pursuit towards sustainable development is leading to increased adoption of renewable energy sources. Wind turbines are
promising sources of clean energy, but regularly suffer from failures
and down-times, primarily due to the complex environments and
unpredictable conditions wherein they are deployed. While various
studies have earlier utilised machine learning techniques for fault
prediction in turbines, their black-box nature hampers explainability and trust in decision making. We propose the application of
causal reasoning in operations & maintenance of wind turbines using Supervisory Control & Data Acquisition (SCADA) data, and harness
attention-based convolutional neural networks (CNNs) to identify
hidden associations between different parameters contributing to
failures in the form of temporal causal graphs. By interpreting these
non-obvious relationships (many of which may have potentially
been disregarded as noise), engineers can plan ahead for unforeseen
failures, helping make wind power sources more reliable.
Fragile Earth Workshop, KDD, August 2020, San Diego, CA
This paper states the challenges in fine-grained target-dependent Sentiment Analysis for social media data using recurrent neural networks. Firstly, we outline the problem statement and give a brief overview of related work in the area. Then we outline progress and results achieved to date, a brief research plan and future directions of this work.
To appear. In AAAI-2020 Doctoral Consortium. New York, USA.
We propose a novel approach for fine-grained emotion classification in tweets using a Bidirectional Dilated LSTM (BiDLSTM) with attention. Conventional LSTM architectures can face problems when classifying long sequences, which is problematic for tweets, where
crucial information is often attached to the end of a sequence, e.g. an emoticon. We show that by adding a bidirectional layer, dilations and attention mechanism to a standard LSTM, our model overcomes these problems and is able to maintain complex data
dependencies over time. We present experiments with two datasets,
the 2018 WASSA Implicit Emotions Shared Task and a new dataset
of 240,000 tweets. Our BiDLSTM with attention achieves a test
accuracy of up to 81.97%, outperforming competitive baselines by
up to 10.52% on both datasets. Finally, we evaluate our data against
a human benchmark on the same task.
To appear. In Proceedings of AAAI-2020 Workshop on Affective Content Analysis. New York, USA
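The two architectural ideas named in the abstract above, dilation and bidirectionality, can be sketched with a toy recurrent cell. This is deliberately not an LSTM and not the authors' BiDLSTM: the single tanh cell, the fixed weights `w_in` and `w_rec`, and the function names are all assumptions chosen to keep the example short. It only shows how a dilated recurrence reads the state from `dilation` steps back, and how a backward pass lets information at the end of a sequence (such as a trailing emoticon) reach earlier positions.

```python
import math


def dilated_recurrence(inputs, dilation, w_in=0.5, w_rec=0.5):
    """Toy recurrent pass where step t reads the state from step t - dilation."""
    states = []
    for t, x in enumerate(inputs):
        # With dilation 1 this is an ordinary recurrence; larger values skip steps.
        previous = states[t - dilation] if t >= dilation else 0.0
        states.append(math.tanh(w_in * x + w_rec * previous))
    return states


def bidirectional(inputs, dilation):
    """Pair a forward pass with a backward pass over the reversed sequence."""
    forward = dilated_recurrence(inputs, dilation)
    backward = dilated_recurrence(inputs[::-1], dilation)[::-1]
    return list(zip(forward, backward))


# A signal only at the first position, propagated with dilation 2.
forward_states = dilated_recurrence([1.0, 0.0, 0.0, 0.0], dilation=2)
# A signal only at the last position; the backward pass carries it to earlier steps.
paired_states = bidirectional([0.0, 0.0, 0.0, 1.0], dilation=1)
print(forward_states)
print(paired_states[0])
```

In the first call only every second position receives the initial signal, which is the skipping behaviour that dilation introduces. In the second call the forward state at position 0 is zero while the backward state is not.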
Wind turbines suffer from operational inconsistencies due to a variety of factors, ranging from environmental changes to intrinsic anomalies in specific components, such as the gearbox, generator and pitch system. Condition monitoring of wind turbines has been a critical research area in the last decade, wherein Supervisory Control & Data Acquisition (SCADA) data is used to analyse the operational behaviour of the turbine and predict any incipient faults to prevent catastrophic losses caused by unexpected failures. Machine learning models have formed a large part of the data-analytics based methods used for learning from historical failures through supervised learning, but they lack the ability to learn from little labelled data or, for that matter, from no labelled faults in a different domain. Deep learning has shown immense success in areas where time-series data is to be modelled. In this paper, we propose a hybrid deep learning model combining a long short-term memory (LSTM) network with XGBoost, a decision tree-based classifier, to provide the benefits of accuracy through deep learning and transparency through traditional decision trees. Our study shows that transfer learning allows us to make predictions with increasing accuracy on unseen data, which is useful for simulations of new operations, new wind farms or other cases of non-available training data. This can help reduce downtime of turbines through predictive maintenance, by predicting incipient faults, or provide corrective maintenance, by assisting engineers and technicians in analysing the root causes behind a failure, thus contributing to the reliability and uptake of wind energy as a sustainable and promising domain.
2019. In WindEurope Offshore, Copenhagen, Denmark.
Wind energy is one of the fastest-growing sustainable energy sources in the world
but relies crucially on efficient and effective operations and maintenance to generate
sufficient amounts of energy and reduce downtime of wind turbines and associated
costs. Machine learning has been applied to fault prediction in wind turbines,
but these predictions have not been supported with suggestions on how to avert
and fix faults. We present a data-to-text generation system using transformers to
produce event descriptions from SCADA data capturing the operational status of
turbines and proposing maintenance strategies. Experiments show that our model
learns feature representations that correspond to expert judgements. In making a
contribution to the reliability of wind energy, we hope to encourage organisations
to switch to sustainable energy sources and help combat climate change.
2019. In NeurIPS 2019 Workshop on Tackling Climate Change with Machine Learning. Vancouver, Canada.
Recent statistics in suicide prevention show that people are increasingly posting
their last words online and with the unprecedented availability of textual data
from social media platforms researchers have the opportunity to analyse such data.
Furthermore, psychological studies have shown that our state of mind can manifest
itself in the linguistic features we use to communicate. In this paper, we investigate
whether it is possible to automatically identify suicide notes from other types of
social media blogs in a document-level classification task. Also, we present a
learning model for modelling long sequences, achieving an f1-score of 0.84 over
the baselines of 0.53 and 0.80 (best competing model). Finally, we also show
through visualisations which features the learning model identifies.
2019. In AI for Social Good workshop at NeurIPS (2019), Vancouver, Canada.
In this paper we present a dilated LSTM with
attention mechanism for document-level classification of suicide notes, last statements and
depressed notes. We achieve an accuracy of
87.34% compared to competitive baselines of
80.35% (Logistic Model Tree) and 82.27%
(Bi-directional LSTM with Attention). Furthermore, we provide an analysis of both the
grammatical and thematic content of suicide
notes, last statements and depressed notes. We
find that the use of personal pronouns, cognitive processes and references to loved ones are
most important. Finally, we show through visualisations of attention weights that the Dilated LSTM with attention is able to identify
the same distinguishing features across documents as the linguistic analysis.
2019. In Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019) at EMNLP. Hong Kong.
With the rising costs of conventional sources of energy, the world is moving towards sustainable energy sources including wind energy. Wind turbines consist of several electrical and mechanical components and experience an enormous amount of irregular loads, making their operational behaviour at times inconsistent. Operations and Maintenance (O&M) is a key factor in monitoring such inconsistent behaviour of the turbines in order to predict and prevent any incipient faults which may occur in the near future.
2019. Extended Abstract in Northern Lights Deep Learning Workshop (NLDL), Tromso, Norway.
With the greater availability of linguistic data from public social media platforms and the advancements of natural language processing, a number of opportunities have arisen for researchers to analyse this type of data. Research efforts have mostly focused on detecting the polarity of textual data, evaluating whether there is positive, negative or sometimes neutral content. The use of neural networks in particular has recently yielded significant results in polarity detection experiments. In this paper we present a more fine-grained approach to detecting sentiment in textual data, particularly analysing a corpus of suicide notes, depressive notes and love notes. We achieve a classification accuracy of 71.76% when classifying based on text and sentiment features, and an accuracy of 69.41% when using the words present in the notes alone. We discover that while emotions in all three datasets overlap, each of them has a unique ‘emotion profile’ which allows us to draw conclusions about the potential mental state that it reflects. Using the emotion sequences only, we achieve an accuracy of 75.29%. The results from unannotated data, while worse than the other models, nevertheless represent an encouraging step towards being able to flag potentially harmful social media posts online and in real time. We provide a high-level corpus analysis of the data sets in order to demonstrate the grammatical and emotional differences.
2018. In Proceedings of the 7th KDD Workshop on Issues of Sentiment Discovery and Opinion Mining (WISDOM), co-located
with the Knowledge Discovery and Data Mining (KDD), London, UK.
Stochastic natural language generation systems that are trained from labelled datasets are often domain-specific in their annotation and in their mapping from semantic input representations to lexical-syntactic outputs. As a result, learnt models fail to generalize across domains, heavily restricting their usability beyond single applications. In this article, we focus on the problem of domain adaptation for natural language generation. We show how linguistic knowledge from a source domain, for which labelled data is available, can be adapted to a target domain by reusing training data across domains. As a key to this, we propose to employ abstract meaning representations as a common semantic representation across domains. We model natural language generation as a long short-term memory recurrent neural network encoder-decoder, in which one recurrent neural network learns a latent representation of a semantic input, and a second recurrent neural network learns to decode it to a sequence of words. We show that the learnt representations can be transferred across domains and can be leveraged effectively to improve training on new unseen domains. Experiments in three different domains and with six datasets demonstrate that the lexical-syntactic constructions learnt in one domain can be transferred to new domains and achieve up to 75-100% of the performance of in-domain training. This is based on objective metrics such as BLEU and semantic error rate and a subjective human rating study. Training a policy from prior knowledge from a different domain is consistently better than pure in-domain training by up to 10%.
2017. IEEE Computational Intelligence Magazine: Special Issue on Natural Language Generation with Computational Intelligence.
This paper describes how the recurrent connectionist architecture epiNet, which is capable of dynamically modifying its topology, is able to provide a form of transparent execution. EpiNet, which is inspired by eukaryotic gene regulation in nature, is able to break its own architecture down into sets of smaller interacting networks. This allows for autonomous complex task decomposition, and by analysing these smaller interacting networks, it is possible to provide a real world understanding of why specific decisions have been made. We expect this work to be useful in fields where the risk of improper decision making is high, such as medical simulations, diagnostics and financial modelling. To test this hypothesis we apply epiNet to two data sets within UCI’s machine learning repository, each of which requires a specific set of behaviours to solve. We then perform analysis on the overall functionality of epiNet in order to deduce the underlying rules behind its functionality and in turn provide transparency of execution.
2017. In Proceedings of the European Conference on Artificial Life (ECAL), Lyon, France.
Deep learning has recently been adopted for the task of natural language generation (NLG) and shown remarkable results. However, learning can go awry when the input dataset is too small or not well balanced with regard to the examples it contains for various input sequences. This is relevant to naturally occurring datasets, many of which were not prepared for the task of natural language processing but were scraped off the web and originally prepared for a different purpose. As a mitigation to the problem of unbalanced training data, we therefore propose to decompose a large natural language dataset into several subsets that “talk about” the same thing. We show that the decomposition helps to focus each learner’s attention during training. Results from a proof-of-concept study show 73% faster learning over a flat model and better results.
2017. In Proceedings of Language, Data and Knowledge (LDK), Galway, Ireland. Proceedings in: Springer Lecture Notes
in Computer Science (LNCS).
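The decomposition idea in the abstract above, splitting a dataset into subsets that "talk about" the same thing, can be sketched as follows. The grouping criterion here (Jaccard overlap between sets of semantic concepts, with a greedy assignment and a fixed threshold) is an assumption made for illustration and is not claimed to be the method used in the paper; the example data is invented.

```python
def jaccard(first, second):
    """Overlap between two concept sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = first | second
    return len(first & second) / len(union) if union else 1.0


def group_by_concepts(examples, threshold=0.5):
    """Greedily place each (concepts, text) pair into the first similar group."""
    groups = []
    for concepts, text in examples:
        for group in groups:
            representative = group[0][0]  # concepts of the group's first member
            if jaccard(concepts, representative) >= threshold:
                group.append((concepts, text))
                break
        else:
            # No sufficiently similar group exists yet, so start a new one.
            groups.append([(concepts, text)])
    return groups


# Invented examples: two weather descriptions and one restaurant description.
examples = [
    (frozenset({"temperature", "wind"}), "Mild and breezy."),
    (frozenset({"temperature", "wind", "rain"}), "Cool, windy and wet."),
    (frozenset({"price", "area"}), "A cheap place in the centre."),
]
subsets = group_by_concepts(examples)
print([len(subset) for subset in subsets])
```

Each resulting subset could then be handed to its own learner, so that every learner trains only on examples from its own subset.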
Recent years have seen a surge of interest in deep learning models that outperform other machine learning algorithms on benchmarks across many disciplines. Most existing deep learning libraries facilitate the development of neural nets by providing a mathematical framework that helps users implement their models more efficiently. This still represents a substantial investment of time and effort, however, when the intention is to compare a range of competing models quickly for a specific task. We present DEFIne, a fluent interface DSL for the specification, optimisation and evaluation of deep learning models. The fluent interface is implemented through method chaining. DEFIne is embedded in Python and is built on top of its most popular deep learning libraries, Keras and Theano. It extends these with common operations for data pre-processing and representation as well as visualisation of datasets and results. We test our framework on three benchmark tasks from different domains: heart disease diagnosis, hand-written digit recognition and weather forecast generation. Results in terms of accuracy, runtime and lines of code show that our DSL achieves equivalent accuracy and runtime to state-of-the-art models, while requiring only about 10 lines of code per application.
2017. In Proceedings of the 2nd International Workshop on Real World Domain Specific Languages (RWDSL), co-located
with the International Symposium on Code Generation and Optimisation (CGO’17). Austin, Texas. In: ACM Digital Library,
International Conference Proceedings Series (ICPS).
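As a generic illustration of a fluent interface implemented through method chaining, as mentioned in the abstract above, consider the following minimal sketch. It is not DEFIne's actual API: the class `ModelSpec` and its methods `layer`, `optimise` and `describe` are hypothetical names invented for this example, and no deep learning library is involved.

```python
class ModelSpec:
    """Hypothetical fluent builder for a model description (not DEFIne's API)."""

    def __init__(self, name):
        self.name = name
        self.layers = []
        self.settings = {}

    def layer(self, kind, units):
        """Record a layer, then return self so that calls can be chained."""
        self.layers.append((kind, units))
        return self

    def optimise(self, optimiser, learning_rate):
        """Record optimiser settings, then return self to continue the chain."""
        self.settings["optimiser"] = optimiser
        self.settings["learning_rate"] = learning_rate
        return self

    def describe(self):
        """Render the recorded layers as a single readable line."""
        parts = [f"{kind}({units})" for kind, units in self.layers]
        return f"{self.name}: " + " -> ".join(parts)


# One chained expression specifies the whole (hypothetical) model.
spec = ModelSpec("digits").layer("dense", 128).layer("dense", 10).optimise("sgd", 0.01)
print(spec.describe())
```

Because every configuration method returns the object itself, a complete specification reads as one left-to-right expression, which is what keeps a DSL of this kind short.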
This paper describes the artificial epigenetic network, a recurrent connectionist architecture that is able to dynamically modify its topology in order to automatically decompose and solve dynamical problems. The approach is motivated by the behavior of gene regulatory networks, particularly the epigenetic process of chromatin remodeling that leads to topological change and which underlies the differentiation of cells within complex biological organisms. We expected this approach to be useful in situations where there is a need to switch between different dynamical behaviors, and do so in a sensitive and robust manner in the absence of a priori information about problem structure. This hypothesis was tested using a series of dynamical control tasks, each requiring solutions that could express different dynamical behaviors at different stages within the task. In each case, the addition of topological self-modification was shown to improve the performance and robustness of controllers. We believe this is due to the ability of topological changes to stabilize attractors, promoting stability within a dynamical regime while allowing rapid switching between different regimes. Post hoc analysis of the controllers also demonstrated how the partitioning of the networks could provide new insights into problem structure.
2016. IEEE Transactions on neural networks and learning systems.