Efficient inference offloading for mixture-of-experts large language models in internet of medical things

Despite recent significant advancements in large language models (LLMs) for medical services, the deployment difficulties of LLMs in e-healthcare hinder complex medical applications in the Internet of Medical Things (IoMT). People are increasingly concerned about e-healthcare risks and privacy prote...

Bibliographic Details
Main Authors:	Yuan, Xiaoming, Kong, Weixuan, Luo, Zhenyu, Xu, Minrui
Other Authors:	School of Computer Science and Engineering
Format:	Article
Language:	English
Published:	2024
Subjects:	Computer and Information Science; Large language models; Efficient inference offloading
Online Access:	https://hdl.handle.net/10356/179743
Institution:	Nanyang Technological University

Similar Items

A game-based incentive-driven offloading framework for dispersed computing
by: Wu, Hongjia, et al.
Published: (2023)

Computation offloading and content caching and delivery in Vehicular Edge Network: a survey
by: Dziyauddin, Rudzidatul Akmam, et al.
Published: (2022)

Distributed algorithm for computation offloading in mobile edge computing considering user mobility and task randomness
by: Zheng, F. Yifeng, et al.
Published: (2022)

DYNAMIC NEURAL ARCHITECTURES FOR IMPROVED INFERENCE
by: CAI SHAOFENG
Published: (2021)

Visible light based occupancy inference using ensemble learning
by: Hao, Jie, et al.
Published: (2018)

Transductive inference using multiple experts for brushwork annotation in paintings domain
by: Yelizaveta, M., et al.
Published: (2013)

Mixtures of experts for understanding model discrepancy in dynamic computer models
by: Nott, D.J., et al.
Published: (2014)

Digital twin-assisted edge computation offloading in industrial internet of things with NOMA
by: Zhang, Long, et al.
Published: (2023)

Proactive Handling of Flight Overbooking: How to Reduce Negative eWOM and the Costs of Bumping Customers
by: Nazifi, Amin, et al.
Published: (2022)

Inductive inference of languages from samplings
by: Jain, S., et al.
Published: (2013)

Parameter optimization for FPSO design using an improved FOA and IFOA-BP neural network
by: Wu, Lei, et al.
Published: (2021)

Reinforcement learning based online request scheduling framework for workload-adaptive edge deep learning inference
by: TAN, Xinrui, et al.
Published: (2024)

Mathematical theory of truth-valued flow inference
by: Wang, P.Z., et al.
Published: (2014)

Justifying the norms of inductive inference
by: Vassend, Olav B.
Published: (2022)

Virtual reality in metaverse over wireless networks with user-centered deep reinforcement learning
by: Yu, Wenhan, et al.
Published: (2023)

Functional inference in semiparametric models using the piggyback bootstrap
by: Dixon, J.R., et al.
Published: (2014)

The language of insults: a look at theme, rheme and negative inferences
by: Leong, Alvin Ping
Published: (2023)

APPROXIMATE INFERENCE FOR COMPLEX MODELS
by: YU XUEJUN
Published: (2022)

A verisimilitude framework for inductive inference, with an application to phylogenetics
by: Vassend, Olav B.
Published: (2022)

Cooperative inference and learning for internet-of-things with limited resources
by: Wang, Yuan
Published: (2019)

LANGUAGE LEARNING OF INDUCTIVE INFERENCE MACHINES WITH MEMORY LIMITATION
by: MA JUNQI
Published: (2018)

Elderly medication adherence with the internet of things
by: TOH, Xiaoping, et al.
Published: (2016)

Strong monotonic and set-driven inductive inference
by: Jain, S.
Published: (2014)

SPATIOTEMPORAL NETWORK INFERENCE OF DENGUE INFECTION CASCADES
by: YEO ZHEN YUAN
Published: (2020)

On some open problems in reflective inductive inference
by: Jain, S.
Published: (2013)

From verification to specification inference
by: Chin, W.-N., et al.
Published: (2013)

Mitotic classes in inductive inference
by: Jain, S., et al.
Published: (2013)

On the non-existence of maximal inference degrees for language identification
by: Jain, S., et al.
Published: (2014)

The inference-boundary model: Reinterpreting theme and rheme
by: Ping, A.L.
Published: (2011)

Fast scene labeling via structural inference
by: ZHANG, Huaidong, et al.
Published: (2021)

Optimizing differential expression analysis for proteomics data via high-performing rules and ensemble inference
by: Peng, Hui, et al.
Published: (2024)

Region inference for an object-oriented language
by: Chin, W.-N., et al.
Published: (2013)

Fear Goliath or David? Inferring competence from demeanor across cultures
by: Lee, Albert, et al.
Published: (2020)

Elderly Medication Adherence Monitoring with the Internet of Things
by: TOH, Xiaoping, et al.
Published: (2016)

Internet-of-things smart-packs for medication adherence
by: Ching, Ming Yang
Published: (2023)

A genetic algorithm for finite state machine inference
by: Nattee Niparnan
Published: (2009)

Reconsidering the role of inference to the best explanation in the epistemology of testimony
by: Gelfert, A.
Published: (2014)

Counting via LED sensing : inferring occupancy using lighting infrastructure
by: Yang, Yanbing, et al.
Published: (2019)

Lightweight privacy preservation techniques for deep learning and inference in Internet of Things
by: Jiang, Linshan
Published: (2022)