Reshama Shaikh discusses women in machine learning and data science.

[This article was first published on DataCamp Community - r programming, and kindly contributed to R-bloggers].

Hugo Bowne-Anderson, the host of DataFramed, the DataCamp podcast, recently interviewed Reshama Shaikh, organizer of the meetup groups Women in Machine Learning & Data Science (otherwise known as WiMLDS) and PyLadies.

Here is the podcast link.

Introducing Reshama Shaikh

Hugo: Hi there, Reshama, and welcome to DataFramed.

Reshama: Hello, Hugo. Thank you for inviting me.

Hugo: It’s such a pleasure to have you on the show, and some of our listeners may not know, but when we initially launched DataFramed, we had a panel down at Two Sigma on 6th Avenue in New York City around Tribeca, a panel on which you appeared. So from kind of the genesis of this podcast, you’ve been involved in a variety of ways.

Reshama: Right. It was January of 2018, almost a year ago.

Hugo: It was, and I actually hadn’t been down there until two nights ago where I went back to present at Jared Lander’s meetup.

Reshama: How did that go? I saw that there was a meetup there with you.

Hugo: It went really, really well. I gave a talk called “What Data Scientists Really Do, according to 50 data scientists”. It’s essentially my takeaways from 50 hours of season one of DataFramed, and I joke, but it’s true, the great thing about giving this talk is that I get to present other people’s opinions and not be held accountable for them.

Reshama: That’s great. Was it recorded?

Hugo: It was and Jared’s put it up, so it’s definitely up there somewhere.

Reshama: All right, I will look for it.

Hugo: Yeah, it was great to launch the podcast with you and it’s great to have you on near the start of season two, particularly to talk about several things that you’re instrumental in thinking about and that you’re passionate about. For example, your work, would you say, WIMLDS? I’m trying to figure out how to pronounce that acronym.

Reshama: Yes. For short, WIMLDS, after saying Women in Machine Learning and Data Science.

Hugo: Great. So very excited to have you here today to talk about what you do at Women in Machine Learning and Data Science and women in machine learning and data science in general, your work with NumFocus on the code of conduct, and a blog post that took off that you wrote recently. The title is “Why Are Women Flourishing In The R Community, But Lagging In Python?”

What are you known for in the data community?

Hugo: So I’m really excited to be talking about all of these things with you today, but before we get there, I want to find out a bit about you. So perhaps you could just start off by telling us what you’re known for in the data community.

Reshama: Right, so I’m known for a few things. I’m an organizer, as you mentioned, for Women in Machine Learning and Data Science. I’m also an organizer for PyLadies. I’m also a board member for Women in Machine Learning and Data Science. I’ve been an organizer for about four years now, and at the time that I started organizing we had two chapters. So it’s been good to see it grow and contribute to that.

Reshama: I also created a repository of documentation for the well-known fast.ai deep learning library, and that’s been popular. I also give workshops throughout New York City, a few online but mainly in person, and I’ve given a dozen in the past two years. I’m a member, as you mentioned, of the NumFocus Diversity in Scientific Computing Committee, DISC for short, and I’m known for some of my blogs.

Hugo: That is a wide ranging, I think, resume or description of a variety of things you do. I’m just wondering how you got involved originally. What is your path or trajectory to data science initially?

Reshama: Sure. I have a degree in statistics, I have a masters in statistics, and I worked for a long time as a biostatistician, and when I was working there, I started my MBA part time at the Stern School of Business. The first couple of years were all core classes in economics, finance, organizational behavior, and then after the first half of the MBA we take elective courses.

Reshama: So I took a course in, I believe it was spring of 2012, called Data Mining For Business Analytics with Professor Foster Provost, and we used WEKA actually. Towards the end of our course we were analyzing project data, and it was mentioned that you could do that really easily in Python.

Reshama: I had never heard of Python at the time, downloaded it, but wasn’t sure how to use it. So the following semester, Professor Foster Provost offered another course called Practical Data Science. It was the first time they had it at Stern, the School of Business, and that’s when I started learning Python and I just, I just fell in love with Python.

Hugo: Fantastic. So I was working in cell biology, thinking about data, statistics, cell mechanics essentially as well, but when I started thinking about working in data science in industry, it’s Foster Provost’s book Data Science For Business, which is one of the first books that I ever read in this space.

Reshama: Well that’s so funny because at the time it was a draft copy. So we received his draft copy for free.

Hugo: Awesome. So what happened next in your journey?

Reshama: So well during my time at Stern, I also took other courses, Data Visualization, Networks, Crowds, and Markets, Design Of Apps, and after that, having a statistical background was really helpful. After that I worked at a data science bootcamp and I acquired even more skills in data science.

Hugo: Great. And I think boot camps are definitely … People ask me about the quality of boot camps and what you get out of them, and I think two very interesting aspects of boot camps are actually firstly the network that you build and where those people end up as well, because you’ll be in a class at a boot camp and a couple of years down the track you’ll have part of your cohort being at Facebook, Twitter, DataCamp, LinkedIn, all of these places.

Hugo: So that’s very exciting, but as you said, the other aspect is you get a broad overview of kind of all the current tools and techniques that are used in data science. So it helps you really establish the knowledge of what your toolkit may need to be.

Reshama: Absolutely. When I was at Stern, they didn’t actually teach us about the Python libraries. We had to do a lot of coding from scratch, and there are things that are not covered in university programs, like git and, at the time, not even cloud computing. So it was nice to get some of those other supplemental and complementary skills.

Hugo: Absolutely. Yeah. Such as, as you say, cloud computing, like figuring out how to use AWS, right?

Reshama: Absolutely. Yeah.

Women in Machine Learning and Data Science (WiMLDS)

Hugo: I’d love to jump in and hear about your work at Women in Machine Learning and Data Science. So perhaps you could start by kind of giving us a brief rundown of the history of WIMLDS and the mission.

Reshama: Sure. So Women in Machine Learning was founded in December 2013 by Erin LeDell, who’s out in San Francisco, so five years ago; we just celebrated our fifth anniversary. She started the first chapter there, which is the Bay Area chapter.

Reshama: It was inspired by the Women in Machine Learning workshops, which are co-located with conferences such as NeurIPS, ICML and COLT, but the thing about those workshops is they’re only accessible to people who attend the workshops. The goal of the Women in Machine Learning and Data Science meetup groups is to have local communities where more people can participate. We are a nonprofit organization, and as of today we have 37 chapters in 17 countries with over 21,000 members.

Hugo: That’s amazing. How can people get involved?

Reshama: The best way is to look for a local chapter close to you, and if there’s somewhere where a chapter doesn’t exist, we encourage people to consider starting one.

Hugo: So, as stated, the mission is to support and promote women and gender minorities who are practicing, studying, or interested in the fields of ML and Data Science. How do you go about achieving this mission?

Reshama: One of the things that we do is, for our speaker events, we look for women speakers. We want our community of women and other underrepresented groups to see women speaking and in leadership positions, and to inspire them. So for our panel talks, we require that it’s at least 50% women. We advocate and promote for conferences, and we work closely with some conferences such as the O’Reilly Media and Machine Learning Conference that are really dedicated to having a diversity of speakers.

Reshama: We also work with conferences that really promote a code of conduct. We have a very active Slack team where women can network and ask questions about jobs and how to approach different issues at work. We also have all sorts of different events, such as career development panels. We recently had a terrific panel with a group of women at McKinsey who talked about the challenges they face at work and how they manage personal life with work life.

Reshama: We have open source sprints. The rates for women being involved in the open source community are quite low. We also have hands-on technical workshops, examples being Deep Learning, Neo4j, and Making A Bot, and we really want to promote a supportive and welcoming environment, and also a communicative environment.

Reshama: I think one of the reasons that the New York chapter has really done so well is that we communicate with our members and we’ve had a good team of volunteers that’s been consistent for close to four years.

Hugo: I want to zoom in on the fact that you mentioned open source sprints, and we might get back to this when discussing why women are flourishing in the R Community and perhaps lagging, as you say, in Python. The open source development community, I think, is one in which we see a serious lack of diversity. That’s an understatement. So I think, as you say, having open source sprints where you hopefully get women and underrepresented groups making pull requests as soon as possible is actually really essential work, particularly on the Python side. Right?

Reshama: I absolutely agree. In fact, I’ve just been spending the past week on this. We had our second open source sprint at the end of September, and we merged five pull requests, and we had over a dozen that are sort of outstanding. I’ve been working with some other people to get some of those pull requests complete. So it’s a great experience. It was my first scikit-learn sprint. I’m like, “Wow, there’s a lot that I could have learned in the past five years.”

Other Initiatives

Hugo: And actually the first time I heard about you and your work at Women in Machine Learning and Data Science was through Andreas Mueller, who has worked with you on these sprints. For those who don’t know Andy, he’s a maintainer and core contributor of scikit-learn. He was running a sprint with you, or maybe it was actually a workshop. So it was several years ago, but it was definitely then that I started hearing about your fantastic work at WIMLDS …

Reshama: Well, that’s good to hear.

Hugo: … in New York City. So what other types of initiatives do you think about and implement?

Reshama: Other initiatives.

Hugo: We’ve talked about career development, and panels, and open source sprints.

Reshama: We’ve done two hackathons. They’re a lot of fun. They’re also a lot of work, but the hackathons are really well organized. There’s a lot of upfront work that goes into them, and we work with different communities in the city. So those are really popular too.

Hugo: Fantastic. Do you also have other networking events, or sharing job postings, or that type of stuff?

Reshama: Right. We have a job board on our website where companies can post jobs and we share them. We have a newsletter, and we’re also on Slack. As organizers, we really try to get to know our members, and talk to them, and sort of create a space that’s welcoming and open.

Reshama: When I go to our events I will see we tend to have about 50 to 60 people and about 90% are women, and when I go to regular Python data science events in the New York City community, it tends to be about 80%, 90% men. It’s just interesting that we are able to attract so many women to our event.

Hugo: I couldn’t agree more. I’ve got a relatively ill formed question about how a lot of these aspects are cultural and that part of our job is to push back in terms of providing supportive and welcoming environments for everyone. We spoke about Jared Lander, and I had him on the podcast last year talking about how he’s gone about building data science communities with his meetup and his conference.

Hugo: Actually in your post on why women are flourishing in the R Community, you point out that at Jared’s conference, for example, he actually provides a very supportive and welcoming environment for a lot of underrepresented groups, including women. In terms of the speakers present, it’s almost at parity in terms of gender.

Reshama: It is. I have attended some of his meetups and I went to the R conference. The last one I attended was in 2016 and it is, it’s really relaxed, and fun, and welcoming. He really puts on great events.

What does 2019 look like for WIMLDS?

Hugo: I couldn’t agree more. What does 2019 look like for WIMLDS? What type of stuff do you have going on?

Reshama: We are growing. I would say of our 37 chapters, 23 were added in this past year alone. So to sustain that growth, we are looking for a sponsor. I don’t know if you’re aware, but R Ladies are supported by the R Consortium, and PyLadies is supported by the PSF, and what support means on sort of a fiscal basis is they cover the meetup dues and provide Meetup Pro accounts, which makes it easier to manage all of the chapters and communicate with them, but they also sort of provide these conduits and relationships to other organizations within the community. Women in Machine Learning and Data Science is sort of in the middle… I don’t even know if there is a machine learning organization to connect it to, but we are looking for a sponsor.

Hugo: If any of our listeners were interested in checking out the details, where could they do that?

Reshama: The details about Women in Machine Learning: our website is our acronym, wimlds.org. We are very active on Twitter, also the same handle, WIMLDS, and you can email us at [email protected]

Hugo: Great, and if people wanted to find out more information about potential sponsorship, is there a landing page for that?

Reshama: We don’t have a landing page for that, but you can send us an email to our info email address and that gets forwarded automatically to all the board members.

Hugo: Yeah, perfect. So this is kind of a brief wishlist of what you’re working on. What else is on your wishlist for WiMLDS?

Reshama: So these are things that have been on my wishlist for a while and I would love to make some progress on them. One of them is, I mentioned the sponsorship and getting a Meetup Pro account, but I would love to record our events. We have smaller chapters in, say, Michigan and Texas, and they don’t have access to all the speakers that we are so fortunate to have in New York City, or the Bay Area, or some of these large metropolitan areas. So I would love to record our events in the same way that Jared does for the R Community and to be able to have that accessible to other chapters, and also to sort of see what other chapters are doing. Our Paris chapter is very active too. Well, fortunately I could understand the French, but it would be good to see what other people are doing. Another item is getting access to space.

Reshama: Probably the most time we spend as meetup organizers is on getting space, and it would be great to have relationships with companies where that was more accessible. I think in the past, even a couple of years ago, we would promote conferences without really thinking as much about what their initiatives are and whether they had a code of conduct. So we want to get really intentional about working with conferences that have a code of conduct and that really want to bring more women into open source. Ultimately all these initiatives will go down the pipeline and bring women into open source, machine learning, data science, artificial intelligence.

How to promote women and gender minorities in data science and machine learning?

Hugo: So we mentioned that listeners perhaps could get involved in sponsorship, but I’m wondering what DataFramed listeners, both individuals and organizations, can do across the board and in general to support and promote women and gender minorities in data science and machine learning?

Reshama: I would say the most important thing that I noticed about my interaction with the community being an organizer is, I would say, is be intentional. I will hear from organizations, “We support diversity.” It’s become such a hackneyed expression, but to really be intentional about what does it mean to support diversity, and what do you do, and where do you need help, and creating these inclusive women’s spaces, which means like giving women a chance to speak. I personally would like to hear companies talk about, “We support pay equity,” and seeing some data that supports that.

Hugo: I think these types of statements are incredibly important, because, as you say, saying, “We support pay equity,” as opposed to “We support diversity,” has a concrete outcome, right?

Reshama: Absolutely.

Intentionality

Hugo: So you mentioned the idea of being intentional, and intentionality, and one example you cited was in terms of having something concrete like supporting pay equity. Is there anything else you want to say about intentionality in general and what that means on a daily basis for people?

Reshama: I want to point to Write Speak Code. Write Speak Code is an organization also of meetup groups, and they do a national conference, and I would say they’ve really been at the forefront of what inclusion is. One of the other things about inclusion is using inclusive language. So often I hear, as an example, “You guys,” and I think it’s important to be really, really careful of how we communicate with each other in the community, because it’s like the default is “You guys,” but really, there are all sorts of different kinds of people.

Reshama: There’s a Discover Cookbook that’s put out by NUMFOCUS about making events more inclusive, to actually research that and to implement it. It doesn’t have to be all of the recommendations, but implement some of them and reach out if people have questions. Also collect and share data about who people are hiring, retaining, and be open to sharing publicly what’s working and what’s not so companies can learn from each other. So often people reach out to us and say, “We want to hire more women,” and “How can we hire more women software engineers, or women data scientists, or women in machine learning? We just can’t find them.”

Reshama: Rather than saying, “We can’t find women”, how about changing the conversation to adjusting our environments so we can attract, hire and retain more women? I think it’s important for companies to realize that people speak to each other. So when a culture is working or if it’s not working, they communicate with each other at dinner, or via email, or via Slack, and it’s important to have a culture that people speak about positively.

Hugo: Absolutely, and I do think the one flip side of intentionality is also developing awareness around what you’re actually doing, what practices you have in place, and what the downstream effects are. That’s incredibly vague and abstract. I’ll give one concrete example, which is the language of job listings. So I’ll try to dig this up, but LinkedIn actually did several studies into how the language of job listings will even affect the ratio of applicant gender or something along those lines.

Reshama: That’s true, and there has been research that shows, for instance, everybody knows, men will apply for a job if they fit 50% of the list of requirements, whereas women will only feel comfortable applying if they think that they could do 90% or more of the requirements, and knowing that research is out there, companies could adjust their job postings if they want to attract more women.

Diversity and Inclusion at NUMFOCUS

Hugo: You mentioned NUMFOCUS and the diversity and inclusion in scientific computing committee, otherwise known as DISC, that you’re on, and I’m wondering how you think about diversity and inclusion at NUMFOCUS as an organization and what type of initiatives you’re involved in there.

Reshama: I think they’re really, really cognizant about the importance of it and how to sort of move that initiative forward. They’ve been fortunate to receive a grant from the Moore Foundation and they have all sorts of different initiatives. So one of the things that has been their project is to have a comprehensive code of conduct, which, it’s interesting, I thought, “Oh, I’m sure somebody else has done it,” because I was researching it for the NUMFOCUS committee, and I was surprised to find out that there is actually not too much out there in terms of comprehensive codes of conduct, and defining acceptable and unacceptable behavior, and how to report.

Reshama: That’s really critical, and for other organizations outside of NUMFOCUS and outside of their projects, all of that information is publicly available. They can access it, and reference it, and use it to make their communities more inclusive and diverse.

Code of Conduct

Hugo: Great, and maybe you can speak to the idea of a code of conduct in general and what it provides for a community at large.

Reshama: Oh, a code of conduct is so important. When I started organizing four years ago, we didn’t have a code of conduct and there were some behaviors that I would observe at our meetup events. I wasn’t quite sure how to address them. Then codes of conduct came along and it made it so much easier to say, “That behavior’s unacceptable, because it violates our code of conduct.” Now I’ve moved forward to sharing the code of conduct before the event begins so people know what behavior is expected. That’s been a tremendous help, and I think that for people who attend events, knowing that a code of conduct is important to the organization makes them feel comfortable in terms of attending events and knowing how the culture will be.

Hugo: Once again this comes back to being intentional, making things precise, and making things concrete. For example, this idea of, “We value diversity”, it isn’t clear how you would implement anything around that, whereas writing a code of conduct puts certain precise steps in a precise code down. Right?

Reshama: Absolutely. Actually this is a side thing I wanted to say something about the diversity. As an example, we’ve had companies in the city, one major tech company that’s reached out to us and said, “We want to hire more women but we can’t find them.” So we talked about an event, and after you know many emails and conversations, they said, “We just don’t have the budget for it.” This is a tech company that provides three meals a day to their employees, and they’ve said that they want to hire more women, but then they say they don’t have the budget for it.

Hugo: Yeah, that seems hypocritical to say the least, or willing to say something to talk the talk but not walk the walk, so to speak. Speaking of codes of conduct, you have a blog post on codes of conduct in general.

Reshama: Yes, I do. Yes.

Hugo: We’ll link to that in the show notes but I think that’s wonderful. So I thought maybe you could say a few words about it here on the show.

Reshama: Sure. So that code of conduct, as we were doing the research for the NUMFOCUS DISC committee, it was published, and because I knew that there weren’t a lot of comprehensive codes of conduct out there, I wanted to write about it so that other organizations would know that it existed and that it could help create a welcoming and inclusive and professional environment for them.

Reshama: I really hope that people will take some time to read this blog because it’s not one that has received that much attention, but it’s probably one of the blogs that I’ve written that’s the most important to me, and I think it’s because it has the potential to really help. I’m going to focus on data science and STEM associations, but really it’s applicable to a lot of other fields of work, but once they… Going from, “Oh, we’re a welcoming community”, this is actually more concrete action, like having a code of conduct. It can prevent inappropriate behavior in the community, it encourages professional inclusionary behavior, and it provides a safe avenue for communicating violations.

Reshama: I’ve reviewed about 10 or 12 organizations that serve the data science community, so in statistics, computer science, machine learning, physics, economics or engineering, and all of these organizations, aside from NUMFOCUS and the American Statistical Association, have very brief codes of conduct without too much information. Total membership in these organizations is 400,000, so if this information gets out to the STEM communities, it really has the potential to have a really favorable impact on inclusivity and professionalism in the entire STEM ecosystem, and also for generations that are coming.

Unconference and the Discover Cookbook

Hugo: As I said, we’ll include the link in the show notes, and if any listeners have any feedback or any comments about it, please get in touch with both of us as well on Twitter or otherwise. So going back to NUMFOCUS and DISC, we’ve talked about the code of conduct, but there are other initiatives such as the Unconference and the Discover Cookbook, and other things, right?

Reshama: About a year ago, in November of 2017, there was a DISC unconference, and about 45 people came from the US, a few from Europe, a few from South America, and we worked on different committees. So some of those committees finalized or worked on the DISC events guide on how to make conferences and events more inclusive. There’s a directory of organizations that serve underrepresented groups. There was a committee on getting started with open source, communicating feedback anonymously, and diversity metrics. So there were all sorts of initiatives and it really is a great organization to get involved in.

Reshama: So many things that I’ve learned through the DISC committee and being at the DISC unconference have helped me in my work with my meetup groups as well as in open source.

Hugo: Great. What’s the cookbook?

Reshama: The cookbook. It’s a list of suggestions that can make conferences and events more inclusive.

Hugo: So Reshama, you worked on the cookbook with several people at NUMFOCUS?

Reshama: Right. The cookbook is a resource that provides actionable items that communities can do to make their events and conferences more inclusive. Not everything is required, but it’s sort of separated by impact and effort, whether high impact, high effort, low impact, or low effort, and they really give helpful suggestions about selecting venues, and food, making food options accessible to people that have different dietary restrictions, providing child care, grants and scholarships, how to select participants, event registration, having the name tags with pronouns. There’s just all sorts of options.

Hugo: So now I really want to jump in and talk about your recent blog post, “Why Are Women Flourishing In The R Community, But Lagging In Python?”, and for the listener, I just wanted to give a bit of background into this blog post. Reshama and I, a couple of months ago, were chatting on a call about what a podcast conversation could look like around these issues, and one of my favorite conferences in the world is SciPy, which happens in Austin, Texas every year.

Hugo: I was at a meeting at SciPy in July in 2018, and it came up how to think more about inclusivity and diversity at SciPy, which is a big challenge. I kind of off the cuff mentioned that, anecdotally, at least, the R Community and R conferences seem to be doing a better job at inclusivity and diversity in general.

Inclusivity and Diversity in Python

Hugo: So there was a big conversation around that and I actually asked Reshama about this and she said, “That’s a great question,” or you said, Reshama, “That’s a great question.” I was just speaking to the listeners for a second. You decided then to go and look into it and published this blog post, which really, really took off.

Reshama: Right. It was sort of a comment at the end, and I had some ideas about why, and I thought, “Well, let me do the research and put it together in a blog post for our conversation today.”

Hugo: So maybe you can tell us some of the key takeaways from your research and the post, and then we can discuss the responses.

Reshama: Sure. I would say, the key takeaway from all of that research, and one of the readers, he really said it best, and this is actually Dylan Niederhut. He wrote, “The key takeaway here seems to be that R has top-down institutional support for inclusion where efforts in Python are less supported and less connected with each other.” I think that pretty much sums it up.

Hugo: I think that’s a very concise and very pertinent takeaway. What does the data actually tell us or show us?

Reshama: There are all sorts of metrics, for instance. It’s not something that’s easily measurable, like taking a blood test and knowing exactly what the white blood cell count is. So the question is, how do you define inclusivity? One of the ways is to look at open source contributions, the percent that are members of the communities, whether it’s the general communities or the women in this programming language community, the percent of speakers, how many people submit talks or proposals, and how many women are involved in different places in the community.

Hugo: What does the breakdown look like between Python and R?

Reshama: The open source contributions, I would say, for R are just so much higher. One of the things that I did discover in my research is that there’s not a lot of data out there about Python. It’s very limited data, and I think that collecting and reporting data is indicative of how much of a priority diversity and inclusivity are for that community.

Reshama: I think that Python is a bit lagging… Maybe they have the data but it’s not really published, but in terms of open source, for R, it’s four times the number of package authors that are women versus in Python.

Hugo: Okay. So this idea, one of the takeaways, is that R has top-down institutional support for inclusion. What does that look like in terms of the community? Where does this institutional support come from?

Reshama: Wow. It goes all the way from the top. So the R Community has the R Consortium, which manages the, I would say, I know you’re very involved in the R community. So I guess the R Consortium is sort of the equivalent of the Python Software Foundation. Would you agree with that?

Hugo: Yeah, absolutely. I think that’s a nice analog.

Reshama: Okay. All right. So the R Consortium has announced that R Ladies is a top level project for them. And what does that mean, again, what do these words mean? Top level project, it’s important to them. What they do is provide a budget cycle for them that, I guess it used to be renewed annually, but they have a budget cycle now that’s set for three years, and, this is really impressive, they’ve given R Ladies a voting seat on the Infrastructure Steering Committee for the R Consortium.

Reshama: Comparatively speaking, the Python Software Foundation supports PyLadies fiscally. They pay the meetup dues for PyLadies, but only for the PyLadies chapters that are part of the Meetup Pro account. They haven’t really converted them, and they’re sort of part of the Python Software Foundation, but there’s no structure there.

Hugo: We’ve discussed how a lot of these aspects are cultural and historical as well, and I actually have no idea whether what I’m about to ask is true or not, but at least my intuition, and what I know anecdotally, is that the R community historically has been a very statistical, academic, pedagogical community, which already has a lot more women involved. Python historically and culturally comes from a software dev background, which doesn’t. So could these two factors play an important role?

Reshama: I really think yes, I would say they definitely play a role, but one thing I do want to point out is that both Python and R were released in the early nineties. So Python was released in 1990 and R was released in 1993, and the women’s communities were founded about 20 years later.

Reshama: I think it’s interesting that, for both Python and R, they were founded 20 years later. I think that whatever the educational background is, both communities felt the need to create these women’s groups to advance them in the respective programming languages. So that’s really important to consider.

Hugo: I agree. And the other question that springs to mind is, the R community is actually a lot more of a community than the Python community. My spidey sense, or whatever you want to call it, arcs up when people talk about the Python community, because my question is, are you talking about the scientific Python community, which actually has a very different structure than the Python Software Foundation? So really the point is that SciPy is very different to PyCon, right?

Reshama: Yes. I have not attended either of them. I think one of the things about Python is that it’s used so extensively that there are all different communities, but I think that can also be an asset for the community too. It doesn’t have to be a hindrance.

What needs to change in the Python community?

Hugo: So what would you like to see the Python community do in the coming years?

Reshama: I would like to see more collaboration between them, more communication between them. It’s probably okay for them to retain their identity, and their structure, and all of that, but they can communicate with each other and merge some of their initiatives. This is the analogy I would use for the Python community versus the R community.

Reshama: Everyone is committed to diversity. Everybody wants inclusivity, and they’re all working hard, and they all have initiatives. But when I looked at the Python community, it’s like each group is in its own individual kayak and they’re rowing hard. Everyone’s looking in the same direction and they’re moving there and they’re all working hard. The R community, though, has sort of invested in this really large boat, and they can move to their goal a lot faster, with a lot fewer rowers, to get where they want to go.

Hugo: Right. And I think what you’re speaking to there is that everyone is on board the boat. That speaks once again to it having a strong sense of community there at the top as well.

Reshama: Yes. Yes. I think it’s more of a structural issue.

Favorite Data Science Technique

Hugo: Right. We’re going to wrap up in a minute, but I’ve got a couple of final questions, and one is something I ask a lot of my guests, which isn’t necessarily related to the rest of the conversation, because we haven’t talked a lot about the technical aspects of data science. So I’m just wondering, what’s one of your favorite data sciencey techniques or methodologies?

Reshama: My favorite one, I would say, is data visualization. I took a course when I was at Stern with Kristin Sosulski, and she teaches a terrific data visualization course, and I realized after all the years of being a statistician that I was not presenting my data as optimally as it could be. So visualization, I would say, is key.

Call to Action

Hugo: For sure. My final question is, do you have a final call to action for the listeners out there?

Reshama: Yes, I do. I would say that if you’re interested in starting a chapter of Women in Machine Learning and Data Science, feel free to reach out to us. Our email is [email protected]. We have a starter kit and we are happy to answer any questions. I would also really love it if people read the blog post I wrote, the article about code of conduct for NeurIPS and other STEM organizations. It’s really critical and helpful information. NumFOCUS will be having nominations for the committees beginning in March, and so I imagine they’ll post in February or so.

Reshama: I would say get involved in the various committees. That can really help wherever people are working, whether it’s at a company, in academia, or just running their own community… And sponsor. If you’re interested in sponsoring Women in Machine Learning and Data Science, also reach out to us at [email protected].

Hugo: So those are all fantastic final calls to action, and in terms of reading your blog posts, I’ll include them in the show notes and I’ll urge everyone when reading them to share them as loudly and as widely as possible, in particular on Twitter. I think as we’ve seen, Twitter is actually an incredible space to have these conversations, at least initially. There is a critical mass of people really willing to engage on Twitter about this as well.

Reshama: Absolutely. I love Twitter. I think it’s a way to, however it’s designed, it’s a way to connect with people that I would not normally know, either meeting them in person or even via some of the other social media platforms out there.

Hugo: Yeah. And even on these issues, which are particularly sensitive, and where on social media, in a lot of other disciplines, you’d generally expect a lot of trolls to come out, a lot of these conversations stay amazingly civilized on Twitter as well, and constructive, I find.

Reshama: I absolutely agree. I think that’s because people are very committed and are interested in advancing the issues and making the community better for everyone. So I agree.

Hugo: Agreed. Reshama, it’s been such a pleasure having you on the show.

Reshama: You too. Thank you, Hugo, for the invite.

Hugo: Thank you.

To leave a comment for the author, please follow the link and comment on their blog: DataCamp Community - r programming.
