•We present a systematic review of optical see-through head mounted display (OST-HMD) usage in augmented reality surgery applications from 2013 to 2020.
•91 articles that fulfilled all inclusion criteria were categorised by OST-HMD device, surgical speciality, surgical application context, visualisation content, experimental design and evaluation, accuracy and human factors of human-computer interaction.
•Human factors emerge as significant to OST-HMD utility.
•The significant upward trend in published articles is clear, but such devices are not yet established in the operating room and clinical studies showing benefit are lacking.
•A focused effort addressing technical registration and perceptual factors in the lab, coupled with design that incorporates human factors considerations to solve clear clinical problems, should ensure that the significant current research efforts will succeed.
This article presents a systematic review of optical see-through head mounted display (OST-HMD) usage in augmented reality (AR) surgery applications from 2013 to 2020. Articles were categorised by: OST-HMD device, surgical speciality, surgical application context, visualisation content, experimental design and evaluation, accuracy and human factors of human-computer interaction. 91 articles fulfilled all inclusion criteria. Some clear trends emerge. The Microsoft HoloLens increasingly dominates the field, with orthopaedic surgery being the most popular application (28.6%). By far the most common surgical context is surgical guidance (n=58) and segmented preoperative models dominate visualisation (n=40). Experiments mainly involve phantoms (n=43) or system setup (n=21), with patient case studies ranking third (n=19), reflecting the comparative infancy of the field. Experiments cover issues from registration to perception with very different accuracy results. Human factors emerge as significant to OST-HMD utility. Some factors are addressed by the systems proposed, such as attention shift away from the surgical site and mental mapping of 2D images to 3D patient anatomy. Other persistent human factors remain or are caused by OST-HMD solutions, including ease of use, comfort and spatial perception issues. The significant upward trend in published articles is clear, but such devices are not yet established in the operating room and clinical studies showing benefit are lacking. A focused effort addressing technical registration and perceptual factors in the lab coupled with design that incorporates human factors considerations to solve clear clinical problems should ensure that the significant current research efforts will succeed.
•Full torso porcine CT model for stereo-endoscopic reconstruction validation
•CT of endoscope and anatomy with constrained manual alignment provides a reference
•Accuracy analysis of repeated alignments and performance of existing algorithms presented
•Open-sourced dataset for stereo reconstruction validation
In computer vision, reference datasets from simulation and real outdoor scenes have been highly successful in promoting algorithmic development in stereo reconstruction. Endoscopic stereo reconstruction for surgical scenes gives rise to specific problems, including the lack of clear corner features, highly specular surface properties and the presence of blood and smoke. These issues present difficulties both for stereo reconstruction itself and for standardised dataset production. Previous datasets have been produced using computed tomography (CT) or structured light reconstruction on phantom or ex vivo models. We present a stereo-endoscopic reconstruction validation dataset based on cone-beam CT (SERV-CT). Two ex vivo small porcine full torso cadavers were placed within the view of the endoscope, with both the endoscope and target anatomy visible in the CT scan. The endoscope orientation was then manually aligned to match the stereoscopic view, and benchmark disparities, depths and occlusions were calculated. The requirement of a CT scan limited the number of stereo pairs to 8 from each ex vivo sample. For the second sample an RGB surface was acquired to aid alignment of smooth, featureless surfaces. Repeated manual alignments showed an RMS disparity accuracy of around 2 pixels and a depth accuracy of about 2 mm. A simplified reference dataset is provided consisting of endoscope image pairs with corresponding calibration, disparities, depths and occlusions covering the majority of the endoscopic image and a range of tissue types, including smooth specular surfaces, as well as significant variation of depth. We assessed the performance of various stereo algorithms from publicly available online repositories. There is significant variation between algorithms, highlighting some of the challenges of surgical endoscopic images. The SERV-CT dataset provides an easy-to-use stereoscopic validation resource for surgical applications, with smooth reference disparities and depths covering the majority of the endoscopic image. This complements existing resources well and, we hope, will aid the development of surgical endoscopic anatomical reconstruction algorithms.
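Under the standard rectified stereo model used for this kind of reference data, depth follows from disparity as Z = fB/d, with f the focal length in pixels and B the baseline. Below is a minimal sketch of that conversion and of an RMS disparity error of the kind reported above, assuming rectified image pairs and known calibration; the function and parameter names are illustrative, not from the SERV-CT tooling:

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_mm, min_disp=0.5):
    """Convert a disparity map (pixels) to metric depth (mm) for a
    rectified stereo pair: Z = f * B / d. Invalid or occluded pixels
    (disparity near zero) are returned as NaN."""
    d = np.asarray(disparity, dtype=np.float64)
    depth = np.full_like(d, np.nan)
    valid = d > min_disp                 # guard against division by ~0
    depth[valid] = focal_px * baseline_mm / d[valid]
    return depth

def rms_disparity_error(estimated, reference, occluded):
    """RMS disparity error over valid, non-occluded reference pixels."""
    valid = ~occluded & np.isfinite(reference) & np.isfinite(estimated)
    diff = estimated[valid] - reference[valid]
    return np.sqrt(np.mean(diff ** 2))
```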
•Multi-center, multi-vendor, multi-protocol prostate MRI dataset was made available for evaluation of segmentation algorithms.
•Evaluated 11 substantially different segmentation algorithms with respect to algorithm performance on multi-center data.
•Algorithms were evaluated relative to human observers.
•Challenge results show that segmentation of prostate MRI images is not a solved issue.
Prostate MRI image segmentation has been an area of intense research due to the increased use of MRI as a modality for the clinical workup of prostate cancer. Segmentation is useful for various tasks, e.g. to accurately localize prostate boundaries for radiotherapy or to initialize multi-modal registration algorithms. In the past, it has been difficult for research groups to evaluate prostate segmentation algorithms on multi-center, multi-vendor and multi-protocol data. This is especially true for MR images, where appearance, resolution and the presence of artifacts are affected by differences in scanners and/or protocols, which in turn can have a large influence on algorithm accuracy. The Prostate MR Image Segmentation (PROMISE12) challenge was set up to allow a fair and meaningful comparison of segmentation methods on the basis of performance and robustness. In this work we discuss the initial results of the online PROMISE12 challenge, and the results obtained in the live challenge workshop hosted by the MICCAI 2012 conference. In the challenge, 100 prostate MR cases from 4 different centers were included, with differences in scanner manufacturer, field strength and protocol. A total of 11 teams from academic research groups and industry participated. Algorithms showed a wide variety in methods and implementation, including active appearance models, atlas registration and level sets. Evaluation was performed using boundary- and volume-based metrics, which were combined into a single score relating the metrics to human expert performance. The winners of the challenge were the algorithms by teams Imorphics and ScrAutoProstate, with overall scores of 85.72 and 84.29. Both algorithms were significantly better than all other algorithms in the challenge (p<0.05) and had efficient implementations, with run times of 8 min and 3 s per case, respectively. Overall, active appearance model based approaches seemed to outperform other approaches such as multi-atlas registration, on both accuracy and computation time. Although average algorithm performance was good to excellent and the Imorphics algorithm outperformed the second observer on average, we showed that algorithm combination might lead to further improvement, indicating that optimal performance for prostate segmentation has not yet been obtained. All results are available online at http://promise12.grand-challenge.org/.
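As an illustration of this style of evaluation, the sketch below computes a volume-overlap metric (Dice) and a linear normalisation that pins a second observer's performance to a fixed score of 85, with a perfect result scoring 100. This is a simplified stand-in: the actual challenge score also combines boundary-distance metrics and differs in detail.

```python
import numpy as np

def dice_coefficient(seg, ref):
    """Dice overlap between two binary prostate masks (1.0 = perfect)."""
    seg, ref = seg.astype(bool), ref.astype(bool)
    return 2.0 * np.logical_and(seg, ref).sum() / (seg.sum() + ref.sum())

def normalised_score(metric_alg, metric_obs, metric_perfect=1.0):
    """Linear score: metric_perfect -> 100, second observer -> 85.
    Valid for higher-is-better metrics such as Dice; distance-based
    metrics would need the direction flipped."""
    scale = (metric_perfect - metric_alg) / (metric_perfect - metric_obs)
    return 100.0 - 15.0 * scale
```

For example, an algorithm scoring Dice 0.92 against a second observer at 0.88 would map to 100 - 15 × (0.08/0.12) = 90.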
Obstetric ultrasound (US) training teaches the relationship between foetal anatomy and the viewed US slice to enable navigation to standardised anatomical planes (head, abdomen and femur) where diagnostic measurements are taken. This process is difficult to learn and results in considerable inter-operator variability. We propose the CAL-Tutor system for US training based on a US scanner and phantom, where models of both the baby and the US slice are displayed to the trainee in their physical locations using the HoloLens 2. The intention is that AR guidance will shorten the learning curve for US trainees and improve spatial awareness. In addition to the AR guidance, we also record many data streams to assess user motion and the learning process. The HoloLens 2 provides eye gaze and head and hand positions, ARToolkit and NDI Aurora tracking give the US probe positions, and an external camera records the overall scene. These data can provide a rich source for further analysis, such as distinguishing expert from novice motion. We have demonstrated the system on a sample of engineers. Feedback suggests that the system helps novice users navigate the US probe to the standard plane. The data capture is successful, and initial data visualisations show that meaningful information about user behaviour can be captured. Initial feedback is encouraging and shows improved user assessment where AR guidance is provided.
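A routine first step when analysing such multi-stream recordings is aligning the asynchronous streams onto a shared clock. Below is a minimal sketch using nearest-timestamp matching, assuming integer millisecond timestamps; the column names are illustrative, not from the released system:

```python
import pandas as pd

def align_streams(gaze, probe, tolerance_ms=20):
    """Align two timestamped streams (e.g. HoloLens 2 gaze samples and
    NDI Aurora probe poses) onto the gaze clock, matching each gaze
    sample to the nearest probe sample within tolerance_ms."""
    gaze = gaze.sort_values("t_ms")
    probe = probe.sort_values("t_ms")
    return pd.merge_asof(gaze, probe, on="t_ms", direction="nearest",
                         tolerance=tolerance_ms,
                         suffixes=("_gaze", "_probe"))
```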
Mixed reality environments for medical applications have been explored and developed over the past three decades in an effort to enhance the clinician's view of anatomy and facilitate the performance of minimally invasive procedures. These environments must faithfully represent the real surgical field and require seamless integration of pre- and intra-operative imaging, surgical instrument tracking, and display technology into a common framework centered around and registered to the patient. However, in spite of their reported benefits, few mixed reality environments have been successfully translated into clinical use. Several challenges that contribute to the difficulty in integrating such environments into clinical practice are presented here and discussed in terms of both technical and clinical limitations. This article should raise awareness among both developers and end-users toward facilitating a greater application of such environments in the surgical practice of the future.
3D models produced from medical imaging can be used for treatment planning, prosthesis design, teaching and communication. Despite the clinical benefit, few clinicians have experience of how 3D models are produced. This is the first study evaluating a training tool to teach clinicians to produce 3D models and reporting the perceived impact on their clinical practice.
Following ethical approval, 10 clinicians completed a bespoke training tool comprising written and video material alongside online support. Each clinician, and 2 technicians included as a control, were sent 3 CT scans and asked to produce 6 fibula 3D models using open-source software (3D Slicer). The models produced by the clinicians were compared to those produced by the technicians using the Hausdorff distance. Thematic analysis was used to study the post-intervention questionnaire.
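For reference, below is a minimal sketch of a symmetric Hausdorff distance between two surfaces represented as point clouds, such as vertices sampled from the exported fibula models; the study's exact sampling and averaging choices are not specified here:

```python
import numpy as np
from scipy.spatial import cKDTree

def surface_distances(points_a, points_b):
    """Nearest-neighbour distances from each surface to the other,
    for (N, 3) vertex arrays in mm."""
    d_ab, _ = cKDTree(points_b).query(points_a)
    d_ba, _ = cKDTree(points_a).query(points_b)
    return d_ab, d_ba

def hausdorff_distance(points_a, points_b):
    """Symmetric Hausdorff distance: the worst-case surface deviation."""
    d_ab, d_ba = surface_distances(points_a, points_b)
    return max(d_ab.max(), d_ba.max())
```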
The mean Hausdorff distance between the final models produced by the clinicians and the technicians was 0.65 mm (SD 0.54 mm). The first model made by clinicians took a mean time of 1 hr 25 min; the final model took a mean of 16 min 4 s (range 5:00–46:00 min). All learners (100%) reported finding the training tool useful and said they would employ it in future practice.
The training tool described in this paper successfully trains clinicians to produce fibula models from CT scans. Learners were able to produce models comparable to those of technicians within an acceptable timeframe. This does not replace technicians; however, the learners perceived that this training would allow them to use the technology in more cases, with appropriate case selection, and they appreciate the limits of the technology.
Robot-assisted surgery has potential advantages but lacks force feedback, which can lead to errors such as broken stitches or tissue damage. More experienced surgeons can judge tool-tissue forces visually, and an automated way of capturing this skill is desirable. Methods to measure force tend to involve complex measurement devices or visual tracking of tissue deformation. We investigate whether surgical forces can be estimated simply from the discrepancy between kinematic and visual measurements of the tool position. We show that combined visual and kinematic force estimation can be achieved without external measurements or modelling of tissue deformation. After initial alignment when no force is applied to the tool, visual and kinematic estimates of tool position diverge under force. We plot visual/kinematic displacement against force using vision- and marker-based tracking. We demonstrate the ability to discern the forces involved in knot tying, and to visualize the displacement force, using the publicly available JIGSAWS dataset as well as clinical examples of knot tying with the da Vinci surgical system. The ability to visualize or feel forces using this method may offer an advantage to those learning robotic surgery, as well as adding to the information available to more experienced surgeons.
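The core idea can be sketched simply: after a zero-load registration, the vision-tracked tool tip diverges from the kinematically reported tip under load, and a calibrated gain maps that divergence to a force estimate. Below is a minimal sketch assuming a linear-elastic relationship; the stiffness gain is an illustrative parameter that would need per-instrument calibration, not a value from the paper:

```python
import numpy as np

def estimate_force(p_visual, p_kinematic, p_visual0, p_kinematic0,
                   stiffness_n_per_mm):
    """Estimate the tool-tissue force vector (N) from the divergence
    between the vision-tracked tip and the kinematic tip (both in mm).
    p_visual0 / p_kinematic0 are positions captured at zero load."""
    offset = p_visual0 - p_kinematic0        # zero-load registration
    displacement = p_visual - (p_kinematic + offset)
    return stiffness_n_per_mm * displacement
```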
A fundamental challenge in the development of image-guided surgical systems is alignment of the preoperative model to the operative view of the patient. This is achieved by finding corresponding structures in the preoperative scans and on the live surgical scene. In robot-assisted laparoscopic prostatectomy (RALP), the most readily visible structure is the bone of the pelvic rim. Magnetic resonance imaging (MRI) is the modality of choice for prostate cancer detection and staging, but extraction of bone from MRI is difficult and very time consuming to achieve manually. We present a robust and fully automated multi-atlas pipeline for bony pelvis segmentation from MRI, using an MRI appearance embedding statistical deformation model (AE-SDM). The statistical deformation model is built using the node positions of deformations obtained from hierarchical registrations of full pelvis CT images. For datasets with corresponding CT and MRI images, we can transform the MRI into CT SDM space. MRI appearance can then be used to improve the combined MRI/CT atlas to MRI registration using SDM constraints. We can use this model to segment the bony pelvis in a new MRI image where no CT is available. A multi-atlas segmentation algorithm is introduced which incorporates MRI AE-SDM guidance. We evaluated the method on 19 subjects with corresponding MRI and manually segmented CT datasets by performing a leave-one-out study. Several metrics are used to quantify the overlap between the automatic and manual segmentations. Compared to the manual gold standard segmentations, our robust segmentation method produced an average surface distance of 1.24 ± 0.27 mm, which outperforms state-of-the-art algorithms for MRI bony pelvis segmentation. We also show that the resulting surface can be tracked in the endoscopic view in near real time using dense visual tracking methods. Results are presented on a simulation and a real clinical RALP case. Tracking is accurate to 0.13 mm over 700 frames compared to a manually segmented surface. Our method provides a realistic and robust framework for intraoperative alignment of a bony pelvis model from diagnostic quality MRI images to the endoscopic view.
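The label-fusion step common to multi-atlas pipelines of this kind can be sketched as majority voting over atlas labels warped into the target MRI; the paper's method adds AE-SDM guidance to the registrations themselves, so treat this only as the generic fusion step:

```python
import numpy as np

def majority_vote_fusion(propagated_labels):
    """Fuse binary pelvis masks propagated from several registered
    atlases: a voxel is foreground if more than half the atlases
    agree. propagated_labels is a list of (D, H, W) boolean arrays."""
    votes = np.sum(np.stack(propagated_labels, axis=0), axis=0)
    return votes > (len(propagated_labels) / 2.0)
```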
Purpose
Colorectal cancer is the third most common cancer worldwide, and early therapeutic treatment of precancerous tissue during colonoscopy is crucial for better prognosis and can be curative. Navigation within the colon and comprehensive inspection of the endoluminal tissue are key to successful colonoscopy but can vary with the skill and experience of the endoscopist. Computer-assisted interventions in colonoscopy can provide better support tools for mapping the colon to ensure complete examination and for automatically detecting abnormal tissue regions.
Methods
We train the conditional generative adversarial network pix2pix to transform monocular endoscopic images to depth, which can be a building block in a navigational pipeline or be used to measure the size of polyps during colonoscopy. To overcome the lack of labelled training data in endoscopy, we propose to use simulation environments and to additionally train the generator and discriminator of the model on unlabelled real video frames in order to adapt to real colonoscopy environments.
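A sketch of how such a mixed update could look, with L1 supervision on simulated image/depth pairs and adversarial terms on both simulated and unlabelled real frames; the losses, weights and schedule here are illustrative rather than the paper's exact recipe (PyTorch, with G and D supplied as networks, D taking an (image, depth) pair):

```python
import torch
import torch.nn.functional as F

def gan_loss(logits, is_real):
    target = torch.ones_like(logits) if is_real else torch.zeros_like(logits)
    return F.binary_cross_entropy_with_logits(logits, target)

def training_step(G, D, opt_G, opt_D, sim_rgb, sim_depth, real_rgb, l1_w=100.0):
    """One optimisation step. G: RGB -> depth. D is conditioned on the
    input image, so it sees fakes from both domains and must also model
    real colonoscopy appearance."""
    # Discriminator: simulation ground truth is the only "real" pair.
    fake_sim = G(sim_rgb).detach()
    fake_real = G(real_rgb).detach()
    d_loss = (gan_loss(D(sim_rgb, sim_depth), True)
              + 0.5 * gan_loss(D(sim_rgb, fake_sim), False)
              + 0.5 * gan_loss(D(real_rgb, fake_real), False))
    opt_D.zero_grad()
    d_loss.backward()
    opt_D.step()

    # Generator: supervised L1 where labels exist, adversarial everywhere,
    # nudging G towards plausible depth on real frames it has no labels for.
    pred_sim, pred_real = G(sim_rgb), G(real_rgb)
    g_loss = (l1_w * F.l1_loss(pred_sim, sim_depth)
              + gan_loss(D(sim_rgb, pred_sim), True)
              + gan_loss(D(real_rgb, pred_real), True))
    opt_G.zero_grad()
    g_loss.backward()
    opt_G.step()
    return d_loss.item(), g_loss.item()
```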
Results
We report promising results on synthetic, phantom and real datasets and show that generative models outperform discriminative models when predicting depth from colonoscopy images, in terms of both accuracy and robustness to changes in domain.
Conclusions
Training the discriminator and generator of the model on real images, we show that our model performs implicit domain adaptation, which is a key step towards bridging the gap between synthetic and real data. Importantly, we demonstrate the feasibility of training a single model to predict depth from both synthetic and real images without the need for explicit, unsupervised transformer networks mapping between the domains of synthetic and real data.