Blog - 6 Jun 2022
7 minute read
Data Science & Analytics
EARL
Open Source
R Community


Driven by incredible communities, R and Python have quickly become the de facto standards for analytics. But how do you move away from proprietary tech like SAS gracefully?


Moving from SAS to R: six guiding principles to assure success.

Analytic conversion projects are often as much about people and decision-making as they are about code. Tom Bowling, a Senior Data Scientist at Ascent, reveals how to plan, execute and enable a successful implementation. 


As data science has become more prevalent, we need technologies that let us apply cutting-edge algorithms and techniques more quickly.

Driven by incredible communities, R and Python have quickly become the de facto standards for analytics, providing environments in which practitioners can import, clean, visualise and model data. These rich programming languages allow practitioners to innovate rapidly and create new capabilities based on an expanding universe of novel algorithms.

An obvious starting point when considering a migration away from SAS is to define your strategic objectives and motives for change: is the decision driven by cost, skills, stakeholder engagement or the desire to innovate? In our experience, organisations look to migrate for a variety of reasons, and these typically align to four main objectives:

  • Cost: Historically, cost has been a primary driver for migration. Organisations are often looking at options to reduce cost and reliance on proprietary software. Open-source leaders like R and Python offer free options, with code that is contributed and supported by the community without annual licensing contracts.

  • Capability: In the age of digital transformation, organisations are keen to develop new capabilities around the interpretation of data and to leverage modern data science approaches. A key reason for migration is to give teams access to the rich functionality that open-source technologies like R and Python provide, including high-quality graphical outputs, cutting-edge modelling techniques, flexible reporting and the ability to create applications (using R Shiny, for example).

  • Flexibility: The ability to scale and flex a platform in line with your innovation journey is critical. Ensuring that it is fit for purpose in the modern analytic world and supports integration with a range of data science tools should be a focal part of your decision-making process.

  • Skills: Many businesses make decisions based on the skills of their teams, so a vital step in your journey is to assess the capabilities of your internal talent. The wider commercial shift towards open source is mirrored by the emerging talent in the marketplace, as universities teach in modern open-source languages like R.

An effective 6-step migration.  

In my experience, a successful migration typically involves six main steps.

  1. STRATEGY: Create an effective strategy for migration, covering each of the following 5 elements, with a clearly defined road map and success criteria

  2. SKILLS: Upskill SAS users in the chosen open-source language, and any other aspects of more modern data science methodologies and approaches

  3. DATA: Migrate any data stored in specific SAS formats onto more accessible and appropriate modern data infrastructure

  4. PLATFORM: Move from a SAS system to a trusted, R- or Python-centric data platform, with all technical elements required for modern data science

  5. PROCESS: Change workflows to represent new ways of working with the business, within the analytic team, and with R/Python itself

  6. ACTION: Convert legacy SAS code to R/Python and provide evidence and assurance to stakeholders around the newly migrated codebase.
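One way to gather the evidence mentioned in step 6 is to run the legacy and migrated code on the same inputs and compare what they produce. The sketch below is illustrative only: it assumes both the SAS job and its R/Python replacement can export their results as CSV with a shared key column, and the function names, sample data and tolerance are hypothetical rather than part of any particular migration toolkit.

```python
import csv
import io
import math


def load_rows(csv_text, key):
    """Index the rows of a CSV export by the value in the key column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key]: row for row in reader}


def compare_outputs(legacy_csv, migrated_csv, key, rel_tol=1e-9):
    """Return a list of readable differences between two CSV exports.

    Fields that parse as numbers are compared within a relative tolerance
    (an assumption: small floating-point differences are acceptable);
    all other fields must match exactly.
    """
    legacy = load_rows(legacy_csv, key)
    migrated = load_rows(migrated_csv, key)
    differences = []

    # Rows that exist on only one side.
    for k in sorted(legacy.keys() - migrated.keys()):
        differences.append(f"{k}: missing from migrated output")
    for k in sorted(migrated.keys() - legacy.keys()):
        differences.append(f"{k}: not present in legacy output")

    # Field-by-field comparison for rows present on both sides.
    for k in sorted(legacy.keys() & migrated.keys()):
        for column, old in legacy[k].items():
            new = migrated[k].get(column)
            if new is None:
                differences.append(f"{k}.{column}: missing from migrated output")
                continue
            try:
                same = math.isclose(float(old), float(new), rel_tol=rel_tol)
            except (TypeError, ValueError):
                same = old == new  # non-numeric field: exact match required
            if not same:
                differences.append(f"{k}.{column}: legacy={old!r} migrated={new!r}")
    return differences


# Hypothetical exports from the legacy SAS job and the migrated code.
LEGACY = "region,sales\nnorth,100.10\nsouth,250.00\n"
MIGRATED = "region,sales\nnorth,100.1\nsouth,251.00\n"

report = compare_outputs(LEGACY, MIGRATED, key="region")
for line in report:
    print(line)
```

An empty report for a representative set of inputs is the kind of evidence stakeholders can sign off on, while each reported line points to a specific value to investigate.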

But once you’re through these steps, how do you successfully embed the change in your business and enable your team to thrive in their new environment?

Focus on skills.  

The usual place that a user’s migration journey will begin is skill assessment and training. Unfortunately, attending the training is only half the battle.

It stands to reason that any training will need to be fitted around business-as-usual tasks in a working week, placing a strain on people’s time. However, for new ideas to properly embed themselves, it is best that your users are given dedicated time away from the formal training to put what they’ve learnt into practice.

This might seem obvious, but unless properly staked out and protected, this vital task can easily get overlooked in favour of more urgent day-to-day actions. This can snowball, and if you’re undertaking multiple training courses, users may struggle with the content of later courses if they haven’t solidified their understanding of the basics, reducing the efficacy of the whole process and making for a less enjoyable learning experience.

We’ve found offering ‘office hours’ with our trainers either side of a training course can help, allowing individuals time to digest what they’ve learned and check back with the trainer to ensure understanding of materials.

In addition, we have found it highly effective to bolster formal training with more focussed 1-1 or 1-few tutorial sessions to enable users to get to grips with the migration, observing their day-to-day process and helping them translate what they used to do to what they will do in future. These smaller groups can foster greater trust between the trainers and trainees, and ensure that nobody is left behind or too worried to ask a question they think will make them look silly in front of their peers.

Management buy-in for these additional sessions is important, and being empathetic to the demands of individual roles whilst enabling people to learn effectively will go a long way to ensuring a smooth migration.

Encourage a community of practice.  

One of the reasons open-source technologies are so popular is the vast community of users who are willing to support and help each other with any challenges. As well as relying on the external support from your open-source-software-of-choice community, it makes sense to try and build up a community of practice internally.

Create a place where your team can share useful advice and code snippets, post questions and generally contribute to the overall knowledge base within your organisation. This could be a Teams/Slack channel, a monthly scheduled meeting or an internal SharePoint/Confluence site. This can help both to surface new ideas and to communicate best practices, and it empowers individuals to help each other.

A custom enablement approach. 

Persona mapping: SAS proficiency vs R uptake - what might some typical learning trajectories look like?

Your team will typically fit anywhere along either axis – groupings are for illustrative purposes and only offer high-level guidance.

‘Just get it done’ – This group uses SAS to achieve what they need to do on a daily basis, sees it purely as a tool and doesn’t engage with complex/custom functionality. They will probably not be particularly excited about the prospect of having to learn something new. In addition, they may struggle with some training content; however, once shown the practicalities of how to continue achieving what they currently do (often in a tutorial setting), they will be OK with the process. Simple IDE (integrated development environment – the software where you write your code) tips and tricks can really help delight these individuals, as tools like RStudio have plenty of little productivity-boosting features and keyboard shortcuts that make using them considerably easier than using base SAS. These individuals will likely rely heavily on others in the community when faced with new problems. Encourage them to engage.

New Thrivers – This user group may not be SAS experts but see this retooling as an opportunity to shine. They may put themselves forward to take on additional responsibility after the transition and can act as champions within your new R community, helping others along the way. Encourage them to present new ideas in community forums and share what they’ve achieved.

Adepts – These are the SAS wizards who will pick up R easily and run with it. They will want to stretch their understanding and explore what possibilities the new language unlocks. This group needs to be engaged to establish internal best practices going forward and will help shape the latter half of the migration effort.

Surprise Strugglers – SAS experts who could handle any task you threw at them may not be entirely sure why they’re being made to learn something new, and may not necessarily be happy about it. They might struggle to step away from the SAS way of approaching a problem (e.g. sequential processing of data vs a vectorised approach). This group would benefit from being shown ‘the art of the possible’ early on to drive excitement about the new technology and encourage engagement with the process. Additional support, such as code-pairing these people with Adepts who have gone through the same process, can also help. If required, roles may evolve away from day-to-day code ownership tasks, but it is incredibly important to retain their knowledge of the processes involved with relevant products/projects.
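To make the shift in mindset mentioned above concrete, here is a small illustrative sketch of the same calculation written two ways. It is an assumption-laden toy rather than real migration code: the sales figures are made up, and because Python's standard library has no vectorised array type, whole-column functions stand in here for the vectorised operations that R vectors (or NumPy arrays in Python) provide.

```python
from itertools import accumulate

# Hypothetical monthly sales figures.
sales = [120.0, 80.0, 150.0, 95.0]

# Row-at-a-time: step through the records in order, carrying state forward
# by hand from one record to the next.
running_total = []
total = 0.0
for value in sales:
    total += value
    running_total.append(total)

# Whole-column: describe the result for the entire column in one expression
# and let the library handle the iteration.
running_total_whole_column = list(accumulate(sales))

print(running_total)
print(running_total_whole_column)
```

Both versions give the same answer. The difference is in how the problem is framed, which is often the harder habit to change for someone with years of row-by-row experience.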

Going into the process with an idea of how your team may experience the migration is critical. Put a plan in place for how to sufficiently challenge those who thrive and manage any difficulties for those who struggle; this will help ensure a successful transition.

Conclusion. 

There are many defensive and proactive reasons for a migration away from SAS to open-source, whether this is cost reduction, access to richer capabilities, more flexibility for experimentation or simply the availability of technical skills. Frequently, however, migration is a consequence of a broader data-driven transformation and a business-wide adoption of modern analytical practices. Whatever the drivers for the migration may be, the recommended steps to success and enablement framework should help with a successful migration for your team.

 

 

 

Tom Bowling

Senior Data Scientist

Ascent

As a Senior Data Scientist at Ascent, Tom applies his statistical experience to help customers solve business problems with data. A Statistician by training, Tom pairs deep mathematical capability with programming expertise to understand and address business challenges. He has extensive experience of SAS and tools like R and Python and has been involved in many migrations to open source.
