2015
Ahrens, James
Supercharging the Scientific Process Via Data Science at Scale (Presentation)
30.06.2015, (LA-UR-pending).
@misc{Ahrens2016,
title = {Supercharging the Scientific Process Via Data Science at Scale},
author = {James Ahrens},
url = {http://datascience.dsscale.org/wp-content/uploads/2016/07/Supercharging_the_Scientific_Process_Via_Data_Science_at-Scale_IRTG2057.pptx
http://www.irtg2057.de/index.html},
year = {2015},
date = {2015-06-30},
abstract = {Historically, the scientific process has been used to explain phenomena by iteratively formulating a theory and running real-world experiments to test and improve it. Advances in computer engineering, driven by Moore's law (which states that the number of transistors per square inch on integrated circuits doubles every year), have fundamentally changed the scientific process in two ways. The first change is the availability of inexpensive but highly accurate sensors composed of integrated circuits. These sensors, such as extremely high-resolution cameras and signal recorders, enable the collection of scientific data and are used in all scientific disciplines, including astronomy, physics, biology and, more recently, the social sciences. The second change is the addition of highly detailed scientific simulations that run on high performance computing (HPC) platforms. The performance of these HPC platforms has increased by roughly three orders of magnitude over the past two decades, from terascale (10^12 Floating Point Operations Per Second (FLOPS)) to petascale (10^15 FLOPS). This performance increase has enabled the creation of extremely detailed scientific simulations. These simulations augment the scientific process by providing a proving ground for theories and an environment for virtual experimentation. Both changes produce massive data streams that must be effectively processed, transformed, analyzed and understood through data science. In this talk, I will present new developments in data science, highlighting how the scientific simulation process needs to change for exascale supercomputers (10^18 FLOPS). Exascale supercomputers are bounded by power and storage constraints. These constraints require us to transition from standard, storage-based, post-processing data science approaches to intelligent, automated, streaming, in situ ones. I will present a new approach that focuses on automatically identifying and tracking areas of interest, and then selecting and presenting these areas to scientists. The work will be presented in the context of solving real-world data science problems for the climate and cosmological science communities.},
note = {LA-UR-pending},
keywords = {data science at scale, scientific method},
pubstate = {published},
tppubtype = {presentation}
}
Ahrens, James
Supercharging the Scientific Process Via Data Science at Scale (Presentation)
30.06.2015, (LA-UR-pending).
@misc{Ahrens2015,
title = {Supercharging the Scientific Process Via Data Science at Scale},
author = {James Ahrens},
url = {http://datascience.dsscale.org/wp-content/uploads/2016/08/Supercharging_the_Scientific_Process_Via_Data_Science_at-Scale_Groningen.pptx
http://www.rug.nl/research/fmns/themes/dssc/symposium/},
year = {2015},
date = {2015-06-30},
abstract = {Historically, the scientific process has been used to explain phenomena by iteratively formulating a theory and running real-world experiments to test and improve it. Advances in computer engineering, driven by Moore's law (which states that the number of transistors per square inch on integrated circuits doubles every year), have fundamentally changed the scientific process in two ways. The first change is the availability of inexpensive but highly accurate sensors composed of integrated circuits. These sensors, such as extremely high-resolution cameras and signal recorders, enable the collection of scientific data and are used in all scientific disciplines, including astronomy, physics, biology and, more recently, the social sciences. The second change is the addition of highly detailed scientific simulations that run on high performance computing (HPC) platforms. The performance of these HPC platforms has increased by roughly three orders of magnitude over the past two decades, from terascale (10^12 Floating Point Operations Per Second (FLOPS)) to petascale (10^15 FLOPS). This performance increase has enabled the creation of extremely detailed scientific simulations. These simulations augment the scientific process by providing a proving ground for theories and an environment for virtual experimentation. Both changes produce massive data streams that must be effectively processed, transformed, analyzed and understood through data science. In this talk, I will present new developments in data science, highlighting how the scientific simulation process needs to change for exascale supercomputers (10^18 FLOPS). Exascale supercomputers are bounded by power and storage constraints. These constraints require us to transition from standard, storage-based, post-processing data science approaches to intelligent, automated, streaming, in situ ones. I will present a new approach that focuses on automatically identifying and tracking areas of interest, and then selecting and presenting these areas to scientists. The work will be presented in the context of solving real-world data science problems for the climate and cosmological science communities.},
note = {LA-UR-pending},
keywords = {data science at scale, scientific method},
pubstate = {published},
tppubtype = {presentation}
}
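The abstracts above describe a shift from storage-based post-processing toward automated, in situ analysis that identifies and keeps only regions of interest as a simulation runs. As a rough illustration of that idea (a minimal sketch, not code from the cited talks), the Python example below screens each timestep in memory and stores only a thresholded cutout instead of the full snapshot; the synthetic field, threshold value, and helper names are assumptions made here for demonstration.

import numpy as np

# Illustrative in situ selection sketch. Instead of writing every full field to
# disk for later post-processing, each timestep is screened in memory and only a
# small region of interest is kept.

def synthetic_step(t, shape=(256, 256)):
    """Stand-in for one simulation timestep: a drifting hotspot plus mild noise."""
    y, x = np.indices(shape)
    cy, cx = 60 + 10 * t, 80 + 6 * t                       # hotspot drifts over time
    blob = np.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * 12.0 ** 2))
    return blob + 0.05 * np.random.default_rng(t).standard_normal(shape)

def region_of_interest(field, threshold=0.5):
    """Bounding box of cells above an absolute threshold, or None if nothing qualifies."""
    ys, xs = np.nonzero(field > threshold)
    if xs.size == 0:
        return None
    # A real pipeline would label connected components and track features across
    # timesteps; a single box per step keeps this sketch short.
    return ys.min(), ys.max(), xs.min(), xs.max()

def in_situ_loop(steps=10):
    kept = []
    for t in range(steps):
        field = synthetic_step(t)
        box = region_of_interest(field)
        if box is not None:
            y0, y1, x0, x1 = box
            kept.append((t, field[y0:y1 + 1, x0:x1 + 1].copy()))   # store the cutout only
    return kept

if __name__ == "__main__":
    subsets = in_situ_loop()
    stored = sum(a.size for _, a in subsets)
    print(f"stored {stored} cells in situ vs {10 * 256 * 256} for full snapshots")

Run as a script, this keeps a few thousand cells across ten steps instead of ten full 256 x 256 fields; the same pattern is what makes streaming, in situ selection attractive when exascale storage budgets rule out saving every snapshot.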