Freeware For Freelancers
Finding Your Place in the New World
By Matthew Bauer, Marron Bingle-Davis, Mike Bingle-Davis, Ashley Douds, John McLeod, and Kristoffer Rimaila.
PIVOTING. TRANSITION. These buzzwords are inevitably brought up in every conversation about the current state of employment for geoscientists. Social media platforms and the internet are rife with groups, webinars, and workshops all indicating that they might provide a smooth transition for geoscientists into a more valuable and desired career. The unfortunate truth is that a new career cannot be found in a multi-day online conference any more than a two-hour leadership seminar can turn someone into a Fortune 500 executive.
Geoscientists have a skillset specifically suited to mineral extraction, oil and gas production, and the maintenance of high environmental standards, along with a unique perspective on the cycles and timing of earth processes. Know your worth and establish the foundation that will allow you to continue to apply your valuable skills. Whether you’ve found yourself unemployed or are among the fortunate who currently have jobs, it’s critical to think about the steps you can take to prepare for the future.
Geoscientists affected by the downturn and pandemic lost access to the high-dollar platforms and packages that came with corporate employment: advanced programs designed to provide sedimentologic and stratigraphic analysis, geophysics, facies modelling, and even basic mapping. Some software companies offer reduced rates for the unemployed, but the cost is often still too high.
Fortunately, a multitude of geoscience software, from GIS to seismic, is available online for free. This article summarizes the resources presented in Freeware for Freelancers, a two-part webinar series that provided introductions and in-depth summaries of several free software packages for geoscientists. Recordings of the original presentations are available on YouTube.
This article provides an overview of the following freeware:
• Office suite: Google
• Image editing: GIMP
• Data analysis: Power BI
• Mapping: QGIS
• Seismic interpretation: OpendTect
• Programming: Python
GOOGLE PLATFORM
Mike Bingle-Davis – Kirkwood Oil & Gas mikeb@kirkwoodcompanies.com
The most important underlying element to any functional office is a platform that provides spreadsheets, word processors, and presentation abilities. A Google account provides all of this and more.
These include, but are not limited to, a personal YouTube channel, business website, online data drive, financial insights, and should you be interested in the next step – Colab, a cloud-based Python platform. With a Google account, you have a platform that provides an online presence, websites, video content, and shared drive space.
An estimated 100 million global students and educators use Google and 30 million of these are on Chromebooks. This creates an environment where a majority of the incoming workforce will be fluent with this platform, so it is beneficial to learn it. With a Google account, you now have a relatively robust office program platform to work with.
GIMP (GNU IMAGE MANIPULATION PROGRAM)
Marron Bingle-Davis – Sunshine Valley Petroleum Corp marron@sunshinevalleypetroleum.com
GIMP (GNU Image Manipulation Program) is a free, open-source cross-platform image editor available for GNU/Linux, OS X, Windows, and other operating systems. GIMP provides multiple advanced graphic editing tools for any user interested in image manipulation, from graphic designers to professional photographers to scientists. GIMP is available for download through its website at https://www.gimp.org/. Some of the most commonly used features are painting, photo retouching, batch processing, image rendering, and image format converting; however, the options offered by GIMP are extensive (Figure 1).
In geology, GIMP can be utilized for map editing, creation and editing of cross-sections, creation of expert diagrams or exhibits, annotations, deleting backgrounds, merging images, and much more. Examples include cropping and resizing of images through the “crop” and “unified transform” tools and cleaning images through “clone,” “color picker,” “bucket fill,” and “eraser” tools. If a computer-generated map has extraneous elements as relics from the mapping algorithm, one can easily use GIMP to erase or modify these elements to create a more professional-looking result. Likewise, if one wanted to add text or other annotation elements after the map is generated, GIMP’s “text” tool is there for easy annotations. Background deletion can dispose of unwanted image elements through GIMP’s “foreground select” tool. This tool is great for removing any background elements that distract from the clarity of your images. GIMP also has several paint features that allow you to highlight and illustrate images for creation of diagrams or exhibits for inclusion in presentations, papers, or official documents.
GIMP is highly useful in multiple aspects of geology including sedimentology/stratigraphy, petroleum, paleontology, mining, or any situation where you need to easily modify images without spending a fortune on image software.
POWER BI
Ashley S.B. Douds – Core2Core Geologic Asbdouds@gmail.com
Microsoft’s Power BI Desktop is the software company’s business intelligence platform. Like Spotfire or Tableau, Power BI allows users to visualize data in numerous formats and interrogate large, relational databases. The desktop version of Power BI can be downloaded and installed if a license of Microsoft Office is either purchased or rented by the user.
Power BI contains numerous visualization options including basic maps, ArcGIS plug-ins, scatterplot, treemaps, waterfalls, pie charts and more. It can also intelligently import data. The import process allows you to evaluate the graphical distribution of data before import to make sure any database errors are cleaned up. Furthermore, the Power Query window keeps track of all steps applied so they can be undone easily.
When it comes to flexibility, Power BI allows users to create Groups in order to combine data that normally would not plot on the same graph. For example, if a database included a Formation column and various entries including “Green River Formation”, “Green River Fm”, and “Green River Fm.”, you can combine those data in a Group called “Green River Formation” so those data plot as the same Formation on all visualizations. In this case, the only difference is the way the Formation name was entered into the database. The data should plot together, and the Groups function can assist with that.
Once visualizations have been created, filters and slicers can be applied depending on the ultimate format of the workspace. If the intent of the dashboard is for interactive filtering by the data analyst, filters are the best option and can be quickly applied and removed for just a single graph or all pages in the workbook. In the case of an end-user interacting with the visualizations, a permanent slicer with commonly applied filters can be added to the dashboard to create a better end-user experience.
Interactive data selection is also a key feature of Power BI. When datapoints are identified on a single graph, the other graphs on the page will update to reflect the range associated with that datapoint. For example, when the datapoints from Alaska are selected on the scatterplot (Figures 2 and 3), the map zooms into those datapoints, and the treemap and pie chart highlight where those data fall on each graph by adjusting the shading.
Power BI’s strength as a data evaluation and interpretation tool comes from its ability to intelligently import data, let users create data groups, build linked graphics, and filter data for individual graphs or the entire workbook, among numerous other functions. The software can also generate dashboards for end-users to interact with the data without having to know all the buttons to be pushed for analysis and interpretation. These end-user dashboards can be generated in desktop or mobile format.
To quickly learn the ins and outs of Power BI, consider taking one of the free online courses, such as those available from edX. There are also several resources available to the Power BI community including a blog, forums, and tutorials. All of these features make Power BI a great addition to a freelancer’s software toolbox.
QGIS
John McLeod – Source Rocks International mcleod1999@gmail.com
QGIS is the leading open-source Geographic Information System (GIS) that stores and analyzes geographically referenced data in a relational database and displays it as a stack of viewable map layers. The core program is written largely in C++ with a Python API, and its basic utility is greatly expanded by plugins developed by a global user community. It also incorporates the functionality of other open-source GIS programs such as GRASS and SAGA. It is most similar to the commercial ESRI ArcGIS platform and overlaps in function with many specialized geoscience mapping programs (Figure 4). For newly independent geoscientists who might chafe at the cost of software to perform basic mapping functions, QGIS offers users a good alternative to perform the key geoscience and mapping functions of commercial programs at no cost for the software.
Some of the more important functions include:
• Interpolate, grid, and contour point data
• Georeference raster images
• Query and filter
• Link map features to external data, websites, or handler applications
• Perform complex analysis and editing of geospatial files
• Import and export many raster, vector, and data table formats
• Convert geographic datums and projections
• Symbolize, label, and design high-quality maps
• Display and analyze many external sources of free served GIS data
• Visualize data in 2D, perspective 2D, and triangulation 3D modes
OPENDTECT
Kristoffer Rimaila – dGB Earth Sciences kristoffer.rimaila@dgbes.com
OpendTect is a free, open-source seismic interpretation platform designed to cover everyday seismic interpretation tasks for generalists as well as more research-oriented scientists. OpendTect has been around since 2003 and is now available under the GNU GPL licensing policy. This guarantees that end users are free to use, run, study, modify or share the code. Although this platform can be expanded with third-party plugins for more advanced workflows, such as machine learning, seismic sequence stratigraphy, and inversion methods, many tools are available for the freelancers in our industry at no cost. Data loading, viewing of SEGY files, and even manipulation of corrupt or otherwise bad SEGY files are all available in the free, open-source version of OpendTect. The daily bread and butter interpretation tasks such as well-ties and synthetics, horizon and fault interpretation and attribute analysis are also available (Figure 5). The attribute list features numerous unique attributes, including five modes of spectral decomposition!
For the code-savvy freelancers, C++, MATLAB, and Python code can be integrated via the attribute engine. This way, your own code can be utilized with OpendTect as the visualization platform. Plugin development is also available either via Python libraries or through CMake. Recently, a GitHub repository was made available with examples of machine learning tools that can be utilized as such, or modified to your liking.
Scientists keen on trying out advanced features in OpendTect can do so by using either the F3 Demo or Penobscot datasets, both available on the dGB Earth Sciences seismic marketplace called TerraNubis. At the time of writing, all dGB plugins are available and free to use and demo in these two surveys. Helpful resources are also available, including self-study and instructor-led courses hosted by dGB Earth Sciences and other venues. For more information, reach out to us at info@dgbes.com or visit dgbes.com.
PYTHON
Matthew Bauer – Sabata Energy Consultants and Affiliate Faculty at Colorado School of Mines matthew.w.bauer.pg@gmail.com
WHAT IS PYTHON?
Python is an open-source, object-oriented, and general-purpose programming language that has found wide use in scientific circles. This is largely because the code is clear and logical and uses indentation to mark structure, making it easy for humans to read.
So why would you want to learn to program in Python as a geologist? With a wide variety of open-source packages, or software shared by others, Python allows you to expand your toolbelt rapidly without additional cost. For geologists, these packages allow us to utilize file types otherwise only fully functional with expensive commercial software, including well logs, seismic, and shapefiles. Python also allows you to collect data via APIs and web scraping, improving our understanding of natural systems through superior data coverage. Once you have your data, the package pandas eases the opening, cleaning, filtering, and merging of data in preparation for interpretation. For larger datasets, pandas takes an order of magnitude less effort than MS Excel.
Tired of cleaning up 99 different spellings of the same word in a dataset? Check out fuzzywuzzy, which uses Levenshtein distance to match strings with slight variations. Do you need to store and access large amounts of data in relational SQL databases? Python allows you to integrate multiple flavors of SQL databases into your projects. Tired of time-consuming repetitive workflows that don’t require abstract thought? Automate them with Python to free up time to work on other projects. Have too many variables, a huge dataset, or too fast of a data stream to wrap your mind around in order to make it usable? Packages such as Scikit-Learn, PyTorch, Keras, or TensorFlow allow you to train and deploy multiple types of machine learning models to help extract value from an otherwise overwhelming dataset. Have projects without absolute inputs that you need to evaluate risk on? Building Monte Carlo models with Python allows you to quantify the distributions of possible outcomes. Need to analyze and predict values over spatial systems? The package geostatspy brings the GSLIB: Geostatistical Library to Python.
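To make the string-matching idea above concrete, here is a minimal sketch of Levenshtein distance written with only the standard library. It is not fuzzywuzzy's API; the function names, the canonical formation list, and the messy entries are all hypothetical and chosen purely for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # prev[j] is the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from a
                curr[j - 1] + 1,     # insert a character into a
                prev[j - 1] + cost,  # substitute (free if the characters match)
            ))
        prev = curr
    return prev[-1]


def closest_name(raw: str, canonical: list[str]) -> str:
    """Pick the canonical name with the smallest edit distance to raw."""
    tidy = raw.strip().lower()
    return min(canonical, key=lambda name: levenshtein(tidy, name.lower()))


# Hypothetical canonical names and messy database entries.
CANONICAL = ["Green River Formation", "Niobrara Formation", "Dakota Sandstone"]
messy = ["Green River Fm", "Green River Fm.", "Niobrara Frmation", "dakota sandstone "]
cleaned = [closest_name(entry, CANONICAL) for entry in messy]

for before, after in zip(messy, cleaned):
    print(f"{before!r} -> {after!r}")
```

Always picking the nearest name will happily match garbage to something, so on real data it is worth setting a maximum acceptable distance and reviewing anything that falls outside it.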
Whatever your task, Python enables you to be creative with finding solutions rather than being restricted to the software tools you have.
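As one illustration of the Monte Carlo approach mentioned above, the sketch below samples uncertain inputs to the standard volumetric oil-in-place equation and summarizes the spread of outcomes. Every input range here is a made-up assumption for demonstration, not data from any real prospect, and the helper names are hypothetical.

```python
import random


def simulate_ooip(n_trials: int = 10_000, seed: int = 42) -> list[float]:
    """Return sorted simulated oil-in-place values in stock-tank barrels."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    results = []
    for _ in range(n_trials):
        # Assumed input distributions (illustrative only).
        area_acres = rng.triangular(400, 1200, 800)  # low, high, mode
        thickness_ft = rng.triangular(10, 40, 20)
        porosity = rng.uniform(0.08, 0.16)
        water_saturation = rng.uniform(0.25, 0.45)
        volume_factor = rng.uniform(1.1, 1.4)  # reservoir bbl per stock-tank bbl
        # Volumetric equation; 7758 converts acre-feet to barrels.
        ooip = (7758 * area_acres * thickness_ft * porosity
                * (1 - water_saturation) / volume_factor)
        results.append(ooip)
    return sorted(results)


def percentile(sorted_values: list[float], pct: float) -> float:
    """Nearest-rank percentile of an already-sorted list (pct from 0 to 100)."""
    index = min(len(sorted_values) - 1, int(pct / 100 * len(sorted_values)))
    return sorted_values[index]


outcomes = simulate_ooip()
low, mid, high = (percentile(outcomes, p) for p in (10, 50, 90))
print(f"10th percentile: {low:,.0f} STB")
print(f"50th percentile: {mid:,.0f} STB")
print(f"90th percentile: {high:,.0f} STB")
```

The output is labeled by percentile rather than "P10" and "P90" because those terms are used with opposite meanings in different shops.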
HOW DO I GET STARTED?
Don’t be intimidated by a supposed “10,000-hour” barrier to entry of being able to use Python. You’ll start to see usable skills with an investment of about 20 hours. To accelerate the learning process I’d recommend a general course such as Dr. Chuck’s Python for Everybody or MIT OpenCourseWare’s introduction to Python. After you have a grasp on basic syntax, find some good examples of applied workflows to help you get started using it on your own projects. The Colab notebooks for my Practical Python for Geoscientists are available on GitHub, many of which have recorded lectures available. Additional resources include Dr. Michael Pyrcz’s geostatistics lectures, knowledge and training resources on YouTube, and the December 2020 Outcrop article “Using Python to improve your geologic interpretations: bottom hole temperature workflow.”
All of this so far is available for free, but if you need more hands-on help I’d recommend an in-person short course. Dr. Zane Jobe and I offer Practical Python for Geoscientists through RMAG, and if you’re in the Houston area be sure to check out Daytum.io. If you’d like a more in-depth education, Colorado School of Mines offers a Graduate Certificate: Data Science - Earth Resources.
Once you get started with Python, I highly encourage you to get involved in the community through the Software Underground (SWUNG) and come to social programming events. If you’re in the Denver area also be sure to check out the Denver Data Drivers lecture series. For folks in the Midland area be sure to put the SPE Permian Basin -- Data Analytics Study Group on your calendar.
CAN I SEE SOME EXAMPLES?
When learning to utilize Python in your own workflows, seeing how others have applied Python and where it excels helps you visualize its potential. Python easily handles the gathering, cleaning and merging of related data while allowing the result to be saved in multiple file formats for further interpretation. With MS Excel’s limit of 1,048,576 rows, large datasets require some creative filtering before merging with a sometimes painful VLOOKUP. With pandas, merging three million rows of analysis to sample locations is a breeze. The example in Figure 6 selects the pyrolysis data then plots maturity from TMAX in North Dakota.
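The merge described above can be sketched in a few lines of pandas. This is not the code behind Figure 6: the tables, column names, and Tmax cutoffs below are assumptions for illustration (the cutoffs are commonly quoted approximate values and vary with kerogen type).

```python
import pandas as pd

# Hypothetical sample locations and pyrolysis analyses.
locations = pd.DataFrame({
    "sample_id": ["S1", "S2", "S3"],
    "latitude": [47.9, 48.1, 47.5],
    "longitude": [-103.2, -102.8, -103.6],
})
pyrolysis = pd.DataFrame({
    "sample_id": ["S1", "S1", "S2", "S4"],
    "tmax_c": [432.0, 441.0, 455.0, 478.0],
})


def maturity_from_tmax(tmax_c: float) -> str:
    """Classify thermal maturity from Tmax using assumed approximate cutoffs."""
    if tmax_c < 435:
        return "immature"
    if tmax_c <= 470:
        return "oil window"
    return "gas window"


# A left merge keeps every analysis row and attaches coordinates wherever
# sample_id matches; analyses with no known location get NaN coordinates.
merged = pyrolysis.merge(locations, on="sample_id", how="left")
merged["maturity"] = merged["tmax_c"].apply(maturity_from_tmax)

# Keep only analyses that can be placed on a map.
mappable = merged.dropna(subset=["latitude", "longitude"])
print(mappable)
```

The same `merge` call works unchanged on millions of rows, which is where it pulls ahead of a spreadsheet lookup.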
Interested in clay typing but don’t have petrophysical software? The example in Figure 7 uses the lasio, pandas, and matplotlib packages to open LAS logs, calculate vshale, and plot points based on their potassium-thorium ratio over a reference image.
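For readers who have not met vshale before, one common first-pass estimate is the linear gamma-ray index. The sketch below assumes that method and uses invented readings and baselines. Figure 7 may compute vshale differently, and a real workflow would read the curve from a LAS file rather than a hard-coded list.

```python
def gamma_ray_index(gr: float, gr_clean: float, gr_shale: float) -> float:
    """Linear gamma-ray index, clipped to the 0-1 range, as a vshale estimate."""
    if gr_shale <= gr_clean:
        raise ValueError("gr_shale must be greater than gr_clean")
    index = (gr - gr_clean) / (gr_shale - gr_clean)
    # Readings beyond the chosen baselines are clipped, not extrapolated.
    return max(0.0, min(1.0, index))


# Hypothetical gamma-ray readings (API units) and picked baselines.
gr_curve = [22.0, 45.0, 80.0, 120.0, 150.0]
GR_CLEAN, GR_SHALE = 30.0, 130.0

vshale = [gamma_ray_index(gr, GR_CLEAN, GR_SHALE) for gr in gr_curve]
for gr, vsh in zip(gr_curve, vshale):
    print(f"GR {gr:6.1f} API -> vshale {vsh:.2f}")
```

The clean-sand and shale baselines drive the result, so they should be picked per well or per zone, not reused blindly.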
Python really shines in automating repetitive tasks, allowing geologists to spend more time interpreting geology. The example in Figure 8 parses LAS well log headers for bottom-hole temperatures. The 64,000 LAS logs were parsed overnight then merged to location data. With suspect values removed, the data is saved to several formats including a shapefile for use in QGIS.
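As a rough sketch of what parsing a header for bottom-hole temperature can look like, the code below scans LAS 2.0 style header lines with a regular expression. It is not the workflow behind Figure 8: the sample header is invented, and the assumption that the value sits under a `BHT` mnemonic will not hold for every vendor or vintage.

```python
import re

# A LAS 2.0 header line has the shape:  MNEM.UNIT   VALUE : DESCRIPTION
# The value runs up to the last colon on the line.
HEADER_LINE = re.compile(
    r"^\s*(?P<mnem>[^.\s]+)\s*\.(?P<unit>\S*)\s+(?P<value>.*):(?P<desc>[^:]*)$"
)


def find_bht(las_text, mnemonics=("BHT",)):
    """Return (value, unit) for the first numeric matching header entry, or None."""
    for line in las_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if stripped.startswith("~"):
            if stripped[1:2].upper() == "A":
                break  # ~A starts the log data, so the header is finished
            continue
        match = HEADER_LINE.match(stripped)
        if match and match.group("mnem").upper() in mnemonics:
            try:
                return float(match.group("value").strip()), match.group("unit")
            except ValueError:
                continue  # entry present but not numeric; keep looking
    return None


# Hypothetical LAS file, trimmed to a few lines for illustration.
SAMPLE_LAS = """~VERSION INFORMATION
VERS.   2.0 : CWLS LOG ASCII STANDARD - VERSION 2.0
WRAP.   NO  : ONE LINE PER DEPTH STEP
~WELL INFORMATION
STRT.FT  5000.0 : START DEPTH
STOP.FT  5001.0 : STOP DEPTH
NULL.   -999.25 : NULL VALUE
~PARAMETER INFORMATION
BHT .DEGF  185.0 : BOTTOM HOLE TEMPERATURE
~CURVE INFORMATION
DEPT.FT  : DEPTH
GR  .GAPI : GAMMA RAY
~A
5000.0  85.2
5001.0  90.1
"""

print(find_bht(SAMPLE_LAS))
```

Looping this function over a folder of files is what turns an overnight manual chore into a batch job. The returned unit matters, since headers mix Fahrenheit and Celsius.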
Looking for formation tops or production data in Colorado but don’t have the budget for a data service or the time to pull them manually? COGCCpy reduces that process to a single line of code. COGCCpy is used by providing a list of API numbers which it then iterates through pulling data from the Colorado Oil & Gas Conservation Commission. This data is then made available as a pandas dataframe (Figure 9).
Finally, working on your own or in a smaller shop may limit peer feedback when defining facies. Defining log facies with unsupervised machine learning can provide an unbiased tool to aid in picking tops or mapping facies distribution changes over a spatial area. This example also adds a Euclidean distance curve which allows the identification of beds with outlying properties (Figure 10).
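To show the idea in miniature, the sketch below clusters a handful of samples with a plain k-means written in standard-library Python, then computes each sample's Euclidean distance to its cluster centroid. It is an assumption that Figure 10 uses k-means and a centroid-distance curve; the two-curve samples are invented and already scaled to comparable ranges, which real logs would need first.

```python
import math


def kmeans(points, k, n_iter=50):
    """Plain k-means; returns (labels, centroids)."""
    # Deterministic start: take k points evenly spaced through the input.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign every sample to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


# Hypothetical scaled log samples in depth order: (gamma ray, density).
samples = [
    (0.20, 0.30), (0.25, 0.35), (0.22, 0.28),
    (0.80, 0.75), (0.85, 0.70), (0.78, 0.80),
    (0.24, 0.33), (0.55, 0.10), (0.82, 0.77),
]

labels, centroids = kmeans(samples, k=2)

# Distance curve: how far each sample sits from its own facies centroid.
distance_curve = [math.dist(p, centroids[lab]) for p, lab in zip(samples, labels)]
outlier_index = max(range(len(samples)), key=distance_curve.__getitem__)
print("facies labels:", labels)
print("most unusual sample index:", outlier_index)
```

Samples with a large distance fit their assigned facies poorly, which makes them worth a second look in core or cuttings.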
As a geologist, learning Python will not only provide you with a variety of new tools but it can also change the way you think. Python improves your understanding of scientific concepts by requiring that you break complex concepts into small steps in order to write them into code. While there is a learning curve for using Python, the process will increase your creativity and persistence as you build your programming skills.
CONCLUSION
As we geoscientists know, there remains a lot of work to be done and nowhere near enough time. If you are reading this and are employed, please take some time and lend a hand to those who are not. If you are unemployed, don’t lose hope. Get up, engage with your peers, build your workstation, and write, research and contribute.
We plan on continuing the work we started with the Freeware for Freelancers series, bridging the gaps between the programmers and the users. We all have different perspectives and ideas, and any one of you could hold the key to making real-world differences. We encourage you to contact the contributors to this article: they can lead you to others who are in the same position and are here to help.