Thematic Paper
Big Data
For this thematic paper we talked to:

Tom Ameloot, Postdoctoral Fellow at the Databases and Theoretical Computer Science Research Group (DTCSRG), Hasselt University (photo: Tom Ameloot’s Research Team)
Bart Vanhaelewyn, Media Consumption Analyst at iMinds MICT (Ghent University) and iMinds Media
Jabran Bhatti, Project Leader at Televic
Katrijn Vannerum, Project Manager at Big N2N (Ghent University) and Systems Biologist at the Flemish Life Sciences Research Institute (VIB)
Johannes Cottyn, Assistant Professor in Automation (Ghent University) and Project Leader at XiaK (Centre of eXcellence in Industrial Automation Kortrijk) (Ghent University and Howest, University College West Flanders)
Wilfried Verachtert, Manager of the Exascience Life Lab and High Performance Computing Project Manager at Imec, University of Leuven
Thomas Crombez, Researcher and Lecturer in Theatre History, the Royal Academy of Fine Arts, Artesis Plantijn University College Antwerp
Jan Dewilde, Librarian at the Royal Conservatoire of Antwerp, Artesis Plantijn University College Antwerp
Rudy Gevaert, System Developer, Ghent University
Yves Moreau, Professor in Bioinformatics at the University of Leuven and Researcher at the iMinds Future Health Department, University of Leuven

Thematic papers

The goal of the thematic papers is to present Flemish scientific research internationally. They focus on fundamental and applied research. The thematic papers are published by Research in Flanders, a project run by Flanders Knowledge Area. The project Research in Flanders is funded by the Flemish Government, Department of Foreign Affairs. Flanders Knowledge Area supports, through different projects, the internationalization of higher education in Flanders, Belgium.
www.researchinflanders.be
www.flandersknowledgearea.be
Big Data
Life in a digitised world

Back in September 2014 the internet reached a historic milestone: in that month the worldwide web crossed the proverbial frontier of one billion websites. The www needed only 23 years to grow from a single, solitary web page, back in August 1991, to a whopping 1,000,000,000 websites in 2014. Although the internet is still growing steadily as we speak, it is implausible that this trend will continue for years to come, as individual websites are increasingly losing ground to large web portals like Google, Facebook and Twitter, which do everything in their power to keep us on their pages. But the sheer quantity of data buzzing around the electronic highway keeps on exploding - to mythical proportions at that. Experts now estimate that roughly 5 billion gigabytes of data are added to the web every 10 seconds...

All these huge figures mean that the mountain of useful digital data - called big data, the buzzword of recent years - is growing every second. This is primarily down to three factors. First, there is old analogue data, often kept on tangible and fragile data carriers like paper, CDs, floppy disks and 35mm film; by now a large part of it has been digitised, so it is easier to preserve and access. Secondly, modern science has reached such a level that it can truly measure anything there is to measure - think, for example, of reading someone’s DNA or detecting elementary particles in a particle accelerator. Thirdly, the fact that our lives play out more and more in cyberspace - Google is now able to predict when we need new shoes, for instance - has turned the information about our digital behaviour into big data as well.

The enormous amount of data that is available offers scores of possibilities for science we don’t even know about yet, while at the same time offering a glimpse of ultra-useful, cost-saving and even life-saving applications - health apps that monitor your physical condition or your eating and drinking habits, for example. On top of all this, big data is much more of an alpha than an omega when it comes to scientific research: by putting databases next to each other with the help of powerful computers, or by joining them together, scientists can identify correlations that may ultimately lead to new questions, which can then be taken up by (other) scientists.

Flanders is home to basic research that takes big data as its starting point, and at the same time capitalises on that digital data with new applications. Besides this, every association or institute with an archive worth the name is involved in one digitisation project or another. It is these three aspects of data management we will address in this dossier.
© ConsErfgoed
Digitisation
However well an old-fashioned archive is maintained, ventilated and guarded, it is never completely impossible that an unexpected disaster destroys a large portion of the analogue information inside, stored on paper, photographs, film or in objects (paintings, for instance). Digitising all that analogue material is therefore an absolute must. This belief has moved every association and institute with an archive worth the name to set up digitisation projects all over the place. Besides, digitised archives have the undeniable advantage that they are much easier to consult and search - even from across the world!
Oral tradition

But how do we digitise information that never existed on physical data carriers in the first place, even in the analogue era - the information people tell each other? Tricky though this may seem, it is exactly the challenge of the Belgium is Happening project, set up by the Royal Academy of Fine Arts in Antwerp, which is part of the city’s Artesis Plantijn University College. The aim of the project is to put together an online archive with as much data as possible about performance culture in Belgium, i.e. performing arts, performance art and happenings. For Belgium is Happening, students of AP University College Antwerp have set out to document and visualise as many events in post-war performance culture as possible. ‘We’re not only trying to put all of these events into a neat chronological order,’ explains Thomas Crombez, researcher and lecturer in Theatre History at the Royal Academy of Fine Arts (Artesis Plantijn University College Antwerp). ‘By transcribing the interviews we have conducted, we want to contribute to the written history of a unique and major part of post-war art, literature and theatre history.’ The website where the archive can be freely consulted does not only serve as an environment for users to obtain documentation, but also as a collaboration platform. ‘Students and lecturers can add information and correct each other,’ Crombez explains further. ‘The amount of digitised material (there are about 2,300 events online at the moment) also offers new possibilities in terms of consultation, searchability and visualisation. We invite our students to present their material on timelines and in network diagrams.’
Flanders’ largest music library

Another department of Artesis Plantijn University College Antwerp, the Royal Conservatoire of Antwerp, is working on its own digitisation project, ConsErfgoed, which will digitise the library of the conservatoire and then put it online. ‘Our library has about 600,000 volumes, mainly musical scores,’ says the Conservatoire’s librarian Jan Dewilde. ‘This means our library actually holds the largest collection of music in Flanders. The oldest work we have is a Gregorian manuscript dating from as far back as the 13th century, while our most recent pieces were composed yesterday, so to speak.’ The library of the Conservatoire is first and foremost a private library for the university college, but thanks to its remarkably rich collection of historic music it became the very first recognised heritage library in Flanders. ‘It also serves as a public music library where people from outside can come and consult our collection and borrow items,’ says Dewilde.
More information

Belgium is Happening, Artesis Plantijn University College Antwerp: www.belgiumishappening.be
Library digitisation project, Royal Conservatoire Antwerp: www.libraryconservatoryantwerp.be/erfgoed/
© Big N2N
Big Data
The development of next-generation sequencing technology, which allows organisms’ genetic and other molecular information to be read in no time, has provided science with great possibilities, but also poses major challenges. The ever shorter time needed to, e.g., analyse large pieces of DNA or groups of proteins, and the decreasing cost of these analyses, have produced a real tsunami of molecular data. It’s all very interesting no doubt - as it tells us something about the traits we were born with and how much risk we have of developing certain illnesses - but the search for meaningful information in the gigantic heap of data sequencers spew out is turning more and more into looking for a needle in a haystack.

Differently put: the fact that molecular biology has also entered the big data era does not necessarily mean that from now on everything will take care of itself. Researchers of the SymBioSys centre at the University of Leuven have taken up the challenge. They are using molecular big data for so-called systems biology, trying to understand specifically how mutations present from birth can develop into genetic conditions, or how mutations occurring during our lifetime cause cancer.

‘Systems biology tackles the modern molecular way of doing medicine in an integrative way, by avoiding the focus on the effect of a single mutation on a single protein on a single symptom of a single disease,’ says Yves Moreau, professor in Bioinformatics at the University of Leuven and researcher at the iMinds Future Health Department. ‘Instead we collect comprehensive data across the entire genome to create a systemic view of diseases. This approach has been made possible by new technology that allows us, for example, to sequence a significant fraction of a genome, or even an entire genome, or to measure the activity of all genes in a given pathological state, and so on. These new technologies have transformed biology from a data-poor and labour-intensive science into a highly automated big-data science.’

Single-cell genomics

Moreau and his colleagues at SymBioSys help tackle these issues by developing new algorithms to manage all this data. They have been pioneering single-cell genomics. Moreau elaborates, ‘This is a new frontier in molecular biology where, instead of analysing the genome of a patient or malignancy, we focus on genomic differences between individual cells. For example, in cancer, initial treatment will often wipe out the vast majority of malignant cells, but some drug-resistant cancer cells might survive the treatment, start growing again and lead to relapse. How are these cells different at the level of their genome, and how does this explain which cells are sensitive to treatment and which ones are resistant? How can we improve treatment to make sure all tumour cells are wiped out? Analysis of the genome sequence of individual cells can help pinpoint the relevant mutations, but this means that in the future cancer therapy will not only require sequencing of the genome of the tumour, but rather of hundreds of individual tumour cells. This in turn will lead to more data explosion, with unprecedented computational requirements.’

The Human Genome Project, completed in 2000, sequenced a human genome at a cost of about € 3 billion. By 2007, the genome of James Watson, co-discoverer of the double helix structure of DNA, was sequenced at a cost of less than € 1 million. Today, large centres can sequence individual genomes for about € 1,000. While the data of an individual genome can fit onto one DVD, large genome projects nowadays aim at sequencing 10,000 to 100,000 genomes. ‘Analysing this data is computationally intensive,’ says Moreau. ‘It leads to severe bottlenecks in terms of computing power, data storage and data transfer bandwidth. Sequencing technology has truly led to a data explosion.’
Computational power

The computing methods Moreau and his colleagues are using to get to the bottom of the big data mountain depend on the processing power of computers. Research centre Imec in Leuven is working hard on ways to speed up the interpretation of sequenced data, and even to automate it wherever possible. Imec houses the Exascience Life Lab, which uses the services of a few supercomputers alongside those of human researchers. At the lab they are trying to make DNA analysis a lot more efficient by changing the way computers process the data. ‘Whenever computers analyse DNA samples, they always need to compare them to a piece of reference DNA,’ says Wilfried Verachtert, manager of the Exascience Life Lab and High Performance Computing project manager at Imec. ‘This amounts to an immense jigsaw puzzle that only the brute force of a computer can solve, by alternately matching individual pieces of DNA and re-sorting the letters of the DNA code. The process goes on until all the DNA in the sample has been sequenced.’
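To give a feel for what this matching involves, here is a deliberately naive sketch in Python (a toy illustration, not the lab’s actual software; production aligners such as BWA rely on clever index structures instead of brute force): it slides each sequenced fragment along a reference sequence and records the best-fitting position.

```python
# Toy sketch of reference-based read matching (illustration only).
# Real aligners index the reference instead of scanning it like this.

def best_match(reference: str, read: str) -> tuple[int, int]:
    """Slide `read` along `reference`; return (position, mismatches) of the best fit."""
    best_pos, best_mm = -1, len(read) + 1
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mm = sum(1 for a, b in zip(read, window) if a != b)
        if mm < best_mm:
            best_pos, best_mm = pos, mm
    return best_pos, best_mm

reference = "GATTACAGATTACACATTAGCCGT"
reads = ["TACAGATT", "CATTAGCC", "GATTACAC"]  # fragments from a sequencer
for read in reads:
    pos, mm = best_match(reference, read)
    print(f"{read} -> position {pos}, {mm} mismatch(es)")
```

Multiply this puzzle by the billions of fragments in a real sample, and it becomes clear why brute computing force, and smarter algorithms, are needed.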
Bottleneck

All this is done by several software programs that start working, one by one, with the results from the sequencers. ‘Until very recently they were still fast enough to keep up with the speed of the sequencer, but nowadays the technology in this area is changing so fast - quicker even than chip technology - that the computing part has become the proverbial bottleneck of the entire analysing process,’ says Verachtert. A major obstacle is the fact that the programs have to run in a particular order - they need each other’s output to work with - so they can’t take full advantage of today’s powerful multicore processors. The software also often has to redo the same work (matching and sorting). That is why Verachtert and his colleagues at Imec have been using supercomputers, built by chip giant Intel, since 2010 to speed up the process. Since 2013 they have also been working together with Janssen Pharmaceutica.

Software carrousel

The Exascience Life Lab’s collaboration with one of the world’s largest pharmaceutical companies is no coincidence: Verachtert’s research group and the company’s R&D team have similar goals, shortening clinical trials by making DNA analysis thoroughly more efficient. This public-private co-operation very recently gave birth to a new tool: elPrep, which combines all the steps needed to get sequenced data ready for the last part of DNA analysis, the search for similarities and differences with other genomes. Verachtert explains, ‘With elPrep we only need to run the original data through the whole software carrousel once, and the results are written to a single file at the end. That’s a whole lot better than how things used to be, with a series of temporary files that had to be created and read one by one.’
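The single-pass idea can be pictured with a small sketch (hypothetical Python, not elPrep’s actual code): rather than each preparation step writing a temporary file for the next one to read, the steps are composed in memory, so every record streams through the whole carrousel once and only one output file is written at the end.

```python
# Toy illustration of single-pass data preparation (not the real elPrep).

def filter_unmapped(records):
    for rec in records:
        if rec["mapped"]:              # step 1: drop unmapped reads
            yield rec

def mark_duplicates(records):
    seen = set()
    for rec in records:                # step 2: flag duplicate positions
        rec["duplicate"] = rec["position"] in seen
        seen.add(rec["position"])
        yield rec

def pipeline(records):
    # Composing generators gives one pass over the data: no temp files.
    return mark_duplicates(filter_unmapped(records))

reads = [
    {"position": 10, "mapped": True},
    {"position": 10, "mapped": True},
    {"position": 42, "mapped": False},
    {"position": 7,  "mapped": True},
]

with open("prepared.txt", "w") as out:  # the single output file
    for rec in pipeline(reads):
        out.write(f"{rec}\n")
```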
The first tests with elPrep have proven impressive: the total analysis time for one DNA sample could be cut by more than 50% (a reduction from 13 hours per genome to only 5). ‘You should realise: if, as part of a clinical trial - for example in the search for a new medicine - there are 300 DNA samples to be processed, this can take days, if not weeks. The time and cost savings made with elPrep will allow companies to do more trials, which will ultimately mean better and safer medication,’ says Verachtert.
Personalised medicine

Ghent University’s bioinformatics institute, Bioinformatics Institute Ghent - From Nucleotides to Networks (Big N2N for short), is also working on better analysis methods for molecular big data, for example by assembling pieces of genome faster, better and in a more targeted way, and even by automating the whole process. But Big N2N is also aiming at a higher level, in the area of proteomics, which focuses not on DNA and genes, but on proteins. ‘The data produced by mass spectrometry on proteins is not smaller or less complex than genetic big data,’ says Katrijn Vannerum, project manager at Big N2N and systems biologist at the Flemish Life Sciences Research Institute (VIB).

She is convinced that the fast processing speed of big data will ultimately make precision and personalised medicine a reality - big data straight from your bed to your bedside table, i.e. patient-tailored medication. ‘Our researchers are tracking down mutations sensitive to therapy, for example, and identifying biomarkers for cancer in the skin and brain (these are substances in our body providing a detailed picture of how a particular illness is faring, ed.).’

Research into so-called lncRNAs - long, non-coding RNA molecules that play a part in the formation of tumours - is very innovative in this respect. ‘By analysing big data sets, we can get a better insight into the role these lncRNAs play,’ continues Vannerum. ‘And last but not least, we’re also working on big analyses of epigenetic genome modifications: reversible chemical changes in the genome that can affect gene expression. In the distant future this technique could be applied to cure AIDS and tackle cancer through gene therapy.’

Infectious diseases and prenatal testing

Big data is also being employed in the battle against less malicious or terrifying conditions. Vannerum explains, ‘We’re developing a method that uses big data sets to make more precise and quicker diagnoses and checks possible for diseases caused by bacteria. And we’re working on non-invasive prenatal diagnostic tests for pregnant women, based on thorough data analysis.’

For those readers who are still in doubt: big data is revolutionising plant biology too. Something scientists in the field are particularly keen on, for instance, is comparing the genome of a crop that is vulnerable to pests with that of a resistant variety. ‘By doing that we can find specific resistance genes that breeders and biotechnologists can then work with to grow stronger crops,’ states Vannerum.
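The kind of comparison Vannerum describes can be pictured with a tiny sketch (a toy illustration in Python; real comparative genomics works on whole assemblies with far more sophisticated methods): given two aligned stretches of genome, list the positions where they differ, which are the first candidates for further study.

```python
# Toy sketch: spot candidate variant positions between two aligned
# genome fragments (illustration only, not a real genomics workflow).

def variant_positions(susceptible: str, resistant: str) -> list[int]:
    """Positions where the two aligned sequences disagree."""
    return [i for i, (a, b) in enumerate(zip(susceptible, resistant)) if a != b]

susceptible = "ATGGCCTTAAGC"
resistant   = "ATGGACTTATGC"
print(variant_positions(susceptible, resistant))  # -> [4, 9]
```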
The mark big data is leaving on science has grown so large that Ghent University has now changed its curriculum. The Institute of Permanent Education of the Faculty of Engineering & Architecture and Bioscience Engineering Technologies set up a separate, albeit one-off, big data study programme in 2015. www.ivpv.ugent.be/opleidingen/aanbod/bigdata2015/index2.htm
More information

SymBioSys project, University of Leuven: www.kuleuven.be/symbiosys
Exascience Life Lab, Imec: www2.imec.be/be_en/research/life-sciences/exascience-life-lab.html
Big N2N, Ghent University: www.bign2n.ugent.be
Media and data consumption in Flanders

Back in 2009, Lieven De Marez, professor in Innovation Research at Ghent University and director of the iMinds MICT department, set up the digiMeter project to map the average Flemish person’s annual use of (both traditional and online) media and ICT.

Examining the media and data consumption of the average Flemish Joe looks more straightforward than it is, though. You’d think sending out invitations for an online survey would be enough, but then you obviously only reach the part of the population that already uses the internet. So in order to keep the user panel as representative and active as possible, De Marez and his colleagues hit the road every year to convince people at markets and festivals, in libraries and railway stations, to take part in their digiMeter project. Their work pays off: at least 2,000 people take part in the study every year.

Online media masters

The study puts Flemish people into 5 categories with original and enlightening names, depending on how they use digital media and ICT. The first category, the online media masters, are young people who have grown up with digital media and have hardly known anything else. According to the latest results (from 2014), their laptop is their most prized device, often in conjunction with their smartphone. ‘These online media masters are often young people who are still studying or have only gone to work very recently,’ says Bart Vanhaelewyn, media consumption analyst at Ghent University and MICT. ‘They don’t tend to have the financial means (yet) to buy absolutely everything they want, so they’re unlikely to have bought a tablet. In terms of online content, they tend to choose free or very cheap alternatives.’

Media omnivores

Then there is the second category, the so-called media omnivores. These are mainly people in their 30s who are very well acquainted with digital media and ICT, but have not embedded them into the fabric of their lives as thoroughly as the online media masters have. ‘This group has more financial means to buy devices and content (like subscriptions to Netflix and Spotify), and they also tend to do so,’ Vanhaelewyn continues. ‘This category of people tends to consume a good mix of traditional media, topped with a flexible layer of digital content. Almost every media omnivore has a laptop and a smartphone.’
Digital explorers

Now for the third group: the digital explorers. These are people in their 40s and 50s who have only discovered and learnt to appreciate the great advantages of digital media since they started using a tablet. ‘These are people who found a computer too complicated and unwieldy to handle, but a tablet is more intuitive,’ explains Vanhaelewyn.
‘Although they’re oriented mainly towards traditional media overall, meaning they tend to read a physical newspaper and watch the TV news, they are slowly starting to find their way in the digital world.’
Functional media users

Then there is the fourth group, the functional media users. ‘They’re acquainted with the internet and computers, but they only tend to use them when absolutely necessary. Most of all, they like to stay in their comfort zone of classic media, so you won’t often catch them watching a clip on YouTube or downloading music, but they won’t recoil from sending an email or using a word processor to write a letter.’

Analogue media fans

Lastly there are the analogue media fans. ‘This group is rather suspicious of digital media,’ says Vanhaelewyn. It’s not surprising, then, that this group has the lowest rate of internet adoption of all five categories (only 48% indicate they have an internet connection at home, while the other categories consistently reach rates above 95%). For their media consumption they need no more than their paper newspaper, magazine, traditional radio and TV (often still one with a glass cathode ray tube).

A few other striking results from the 2014 digiMeter:

- 7 in 10 Flemish people spend time media multitasking (i.e. surfing the internet on one device while looking at another screen at the same time).
- People in their 40s and 50s use tablets the most.
- In 2014, the smartphone adoption rate was higher than that of ordinary mobile phones for the first time.
- This does not mean that the classic text message has disappeared, though: young people who often use WhatsApp and Facebook Messenger also tend to send the most text messages.
- 68% of people have a Facebook account.
- News consumption on mobile devices is increasing dramatically.
- YouTube is the most used online music channel.
- 3 in 10 people store their data in the cloud.

More information

digiMeter, iMinds: www.iminds.be/en/gain-insights/digimeter
iMinds Research Group for Media and ICT (MICT), Ghent University: www.ugent.be/ps/communicatiewetenschappen/en/research/mict/
© MICT
The virtues and dangers of big data
Although there is no lack of online travel planners these days, their efficiency - certainly of those promising real-time travel information - is often still below par. Passengers have to sift through several sources and use multiple apps to find the information they need. The TraPIST project, supported by iMinds and the Agency for Innovation by Science and Technology (IWT), wants to do something about this. Not by drinking lots of beer, as its tasty name would suggest, but by creating Train Passenger Interfaces for Smart Travel, TraPIST for short. The aim of the project is to create an interface presenting all the information train passengers need in one easy overview. ‘First of all we need to know what elements are important for an optimal travel experience,’ says Jabran Bhatti, project leader at Izegem electronics company Televic, a business participating in the project. ‘To do this we have enlisted the help of some test subjects with different travel profiles, who lend a hand and play a real part in the design process, where we focus specifically on their experience. Based on the results of the test panel, we will be developing a software platform that collects data from multiple sources. Then we’ll let smart analyses, classification mechanisms and filters loose on the data. Everything should of course happen in a dynamic context, because a train can suddenly be delayed at the last moment... but of course that’s old news for train passengers in Belgium.’

Enriched information

Bhatti calls the travel information people see on their smartphones, and which could be displayed on the screens in the station or even on board trains, enriched information. ‘The way the display of travel advice is designed has been influenced by an innovative process we call co-creation,’ he continues. ‘So we didn’t only use the testing panel as a source of information, but also as fellow developers, if you will. The idea was that TraPIST should supply exactly the information train passengers need at the very moment they need it, in the way they want it.’

Working in the cloud

Efficient real-time travel information is only one digital application that can make our lives easier. Working in the cloud is another. The cloud - a virtual infrastructure with shared software and hardware we don’t need to maintain ourselves, though this means we cannot manage it ourselves either! - seems to have everything needed to replace the way we used to work on computers: saving files on our own hard disk. And there is more in that cloud than we realise: think of all our profiles on social media networks, for example, or all the searches we put into Google and Yahoo! every day.
But of course someone needs to maintain all those clouds and update the software on the supercomputers that keep them up and running. (Re-)programming that software is a gigantic task for the world’s best ICT specialists, many of whom work in Silicon Valley - the Valhalla of internet technology. Well, now they receive some help from Flanders, from Hasselt University to be exact. During his PhD research, Tom Ameloot, postdoctoral fellow at Hasselt University’s Databases and Theoretical Computer Science Research Group (DTCSRG), developed a number of tools that have made programming cloud software faster, less error-prone and more efficient. Ameloot wrote a logical foundation for
Bloom, the language lots of cloud software runs on. It makes using cloud solutions all over the world simpler and faster. ‘I mainly constructed a theory in my research that tried to provide insight into which kinds of programming techniques could be efficient in the cloud, and which not,’ he clarifies.

And where is the limit of working in the cloud? Is the cyber sky the limit in this case? About this, Ameloot says, ‘It all depends on the cloud application in question. If you want to co-ordinate individual servers overall, the network shouldn’t be too big, because calculations will slow down too much. All the computers in the network have to wait for each other, or mostly for the majority anyway. If, on the other hand, they can all run sufficiently by themselves and serve users in a parallel way, then it’s easier to add new ones. You’ll get a bigger and looser cloud that way, admittedly with no or very little overall co-ordination. My guess is that many clouds taking care of data storage, like Google Drive, are of this loose type.’
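Ameloot’s distinction between co-ordinated and loosely co-ordinated clouds can be illustrated with a small sketch (hypothetical Python, not the formal machinery of his research): counting words over separate data shards needs no co-ordination, because each worker proceeds on its own and the partial results can be merged in any order afterwards.

```python
# Toy illustration of a co-ordination-free parallel computation.
# Each worker counts words in its own shard without waiting for the
# others; because merging Counters is order-insensitive, no overall
# co-ordination is needed and adding servers is easy.
from collections import Counter
from multiprocessing import Pool

def count_words(shard):
    return Counter(word for line in shard for word in line.split())

if __name__ == "__main__":
    shards = [["big data big"], ["data cloud"], ["cloud cloud big"]]
    with Pool(processes=3) as pool:
        partial_counts = pool.map(count_words, shards)
    total = sum(partial_counts, Counter())   # merge order doesn't matter
    print(total)  # Counter({'big': 3, 'cloud': 3, 'data': 2})
```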
Ameloot’s group is doing a lot of research, both theoretical and applied, into working in the cloud and big quantities of data in general. ‘My colleagues Bas Ketsman and Frank Neven, for example, are occupied with cloud computing theory,’ he continues, ‘and one of our PhD students is working on a way to make executing calculations based on large quantities of data more efficient.’
Online burglary

Making the switch from the old, analogue era to the new, digital one of course looks all rosy, but it entails some dangers as well - dangers we consider only very little, if at all. Where burglars used to have to overcome physical barriers or sheer human force in order to steal sensitive information, they can now do it online - maybe even from the other side of the globe! Stealing data is not the only thing hackers do, either: they can also mess with company processes from a remote location or even shut them down completely. It is downright terrifying to think of them hacking into the internal network of a nuclear power plant that was set up precisely to make remote operation of the processes possible. This makes industrial security an important area in data science research.

Hackers, viruses and worms

‘Joining the forces of IT with automation networks is causing a lot of movement in the industrial sector,’ says Johannes Cottyn, assistant professor in Automation at Ghent University and project leader at XiaK. ‘The implementation of Ethernet-based networks in production halls has made it possible to get easy access anywhere and at any time to the whole production process. If problems occur, not only the company’s own staff, but also suppliers of machinery and automation systems can log on from a remote location to diagnose the problem and even make changes to the installations.’

That’s all wonderful, but the implementation of such networks also opens the door to virtual threats from hackers and infections with computer viruses and worms. ‘Recent studies have shown industrial control systems are increasingly targeted by such attacks,’ continues Cottyn, ‘so industrial security should be a major area of attention in the design of any automation network.’
Test centre for industrial security
© XiaK
Cottyn also took the initiative to set up XiaK, of which he is now project leader. XiaK stands for Centre of eXcellence in Industrial Automation Kortrijk and works under the auspices of Ghent University - Campus Kortrijk and Howest, University College West Flanders. ‘We started a TETRA project in 2015 to support companies with industrial security,’ he explains. ‘Let’s say we set up an industrial security test centre.’
The primary aim of this TETRA project is to tackle the very current issue of industrial security at its source and to come up with field solutions. To achieve this, Cottyn and his colleagues at XiaK made a list of straightforward goals:

- Setting up an industrial security test centre. Cottyn explains, ‘Our XiaK lab is equipped with some typical models of automation networks, so we can do tests in a secure and controlled environment. In this simulation setting, we can develop and apply targeted attacks on a wide range of automation components, network configurations and industrial control systems, which will enable us to uncover the vulnerabilities of old and new technologies.’

- Stimulating general awareness with online quick scans of industrial security. ‘We made an inventory of various automation components and technologies, with a list of risks, conditions and points of attention for every element. For such a quick scan, users (production and automation companies) just enter their configuration and they get tailored feedback,’ he says.

- Developing and validating a structured approach to execute industrial security audits. ‘A structured approach will enable companies to make their existing automation networks secure step by step, covering several aspects: physical security, patch management and network architecture.’

Because industrial control systems and network technologies are applied in such a wide range of areas, the TETRA project has great potential for companies and even social profit organisations. ‘It’s not only automation and production companies that belong to our wider target group,’ says Cottyn, ‘but also utility companies supplying water, gas and electricity, and even some secure government institutions (like prisons).’

More information

TraPIST, iMinds: www.iminds.be/en/projects/2014/03/20/trapist
Databases and Theoretical Computer Science Research Group, Hasselt University: alpha.uhasselt.be/research/groups/theocomp/
XiaK, Centre of eXcellence in Industrial Automation, Ghent University (Campus Kortrijk) and Howest: www.xiak.be
What if the internet stops?

Internet anywhere and all the time - we take it almost for granted. But in many, mainly developing, countries, a good internet connection is still a rare luxury. Rudy Gevaert, affiliated to the ICT Department at Ghent University, is trying to do something about this, in Ethiopia, Cuba and other places. First some context, though. ‘Jimma University in Ethiopia uses, with all its 1,500 computers together, an internet connection (or so-called uplink) of 256 Mbit/s. A standard broadband internet subscription in Belgium offers even mere home users 100 Mbit/s... So anyone surfing in Jimma has a slower internet connection than we have here at home.’ The people in Africa need a quicker uplink, then - that’s the logical conclusion, or is it? ‘Yes, that would be the simplest solution, but it’s also the most expensive one,’ he answers. ‘And unfortunately that’s impossible in Ethiopia. The country only has one telecom provider, so there’s no competition, and the prices are obviously high. On top of this, the country has no coast, so it can’t lay cables directly and needs to rely on telecom providers in neighbouring countries for its internet. And the uplink from Ethiopia is limited and saturated already.’
A better solution is making better use of the existing bandwidth, or bandwidth management and optimisation (BMO), thinks Gevaert. He sums up the most important tricks of the trade:

- Installing software that stores popular websites locally (caching; see the sketch after this list).
- Offering local download servers (mirrors).
- Offering local mail servers. That would mean 95% of all email traffic would stay within the university network; if everyone uses Yahoo! or Hotmail, a lot of bandwidth is wasted.
- Checking all the websites that get visited for viruses and illegal content.
- Disallowing certain websites, or only allowing them outside office hours. Sites like Facebook and YouTube waste a lot of bandwidth.
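The first trick, caching, can be pictured with a minimal sketch (a toy illustration in Python; a real deployment would use dedicated proxy software such as Squid): a local copy of every fetched page is kept on disk, so repeat requests are served locally instead of consuming the scarce uplink.

```python
# Toy illustration of caching to save uplink bandwidth (not production
# software; real sites would run a dedicated caching proxy).
import urllib.request
from hashlib import sha256
from pathlib import Path

CACHE_DIR = Path("cache")   # local disk is cheap; uplink bandwidth is not
CACHE_DIR.mkdir(exist_ok=True)

def fetch(url: str) -> bytes:
    """Return the page body, touching the network only on a cache miss."""
    key = CACHE_DIR / sha256(url.encode()).hexdigest()
    if key.exists():                           # cache hit: no uplink used
        return key.read_bytes()
    with urllib.request.urlopen(url) as resp:  # cache miss: one fetch
        body = resp.read()
    key.write_bytes(body)                      # keep a local copy
    return body

# The second call is served from local disk, not the saturated uplink.
assert fetch("https://www.example.com/") == fetch("https://www.example.com/")
```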
Gevaert’s approach is pragmatic above all, though. ‘We’re of course facing all the other problems every development project is confronted with as well: staff turnover, lack of proper education and training, inadequate electricity supply - and I could go on like this for a while. We have little or no influence on these external factors, so all we can do is use the limited bandwidth there is in the best possible way. It’s not that difficult, but executing it takes time. We’re also investing a lot of time in training local people so they can solve problems themselves. Then the world will be open to them.’
More information

Information and Communication Technology Department, Ghent University: www.ugent.be/en/ghentuniv/administration/dict
Author: Senne Starckx
The thematic papers are published by Research in Flanders, a project run by Flanders Knowledge Area.
FLANDERS KNOWLEDGE AREA
The project Research in Flanders is funded by the Flemish Government, Department of Foreign Affairs.
Flanders Knowledge Area supports, through different projects, the internationalization of higher education in Flanders, Belgium.
RESEARCH IN FLANDERS
Ravensteingalerij 27 – bus 6
1000 Brussel
T. + 32 (0)2 792 55 19
www.FlandersKnowledgeArea.be
D/2015/12.812/7
The Flemish Government cannot be held responsible for the content of this publication.
Editions
1. Materials Science
2. Urban Planning
3. Industrial Design
4. Research in Times of Crisis
5. World War I
6. Food
7. Big Data
© Big N2N