Indexing Subsidiaries and their email domains #79

Open
devashish-gaikwad opened this issue Apr 22, 2021 · 11 comments
Labels
question Further information is requested

Comments

@devashish-gaikwad
Contributor

Hello,

What is the policy for adding subsidiaries to the OSCI index?
Is it up to the parent company or to EPAM/OSCI to decide how the email domains are listed?

Should all subsidiaries of a company fall under the same umbrella (and index calculation) as the parent company,
OR
should each subsidiary with a different email domain be added as a new company (with its own index calculation) under the subsidiary's name?

For example, a company "X" has 2 subsidiaries, "Y" and "Z".

Option A:

- company: X
  domains:
    - X.com
    - Y.com
    - Z.com

Option B:

- company: X
  domains:
    - X.com

- company: Y
  domains:
    - Y.com

- company: Z
  domains:
    - Z.com

Which would be acceptable?
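
To make the difference between the two options concrete, here is a minimal sketch of how each layout would change the per-company counts. Everything in it is made up for illustration: the author addresses, the `count_by_company` helper, and the two dictionaries are assumptions, and this is not how OSCI itself computes the index.

```python
from collections import Counter

# Hypothetical domain-to-company mappings for the two layouts above.
OPTION_A = {"x.com": "X", "y.com": "X", "z.com": "X"}
OPTION_B = {"x.com": "X", "y.com": "Y", "z.com": "Z"}


def count_by_company(emails, domain_to_company):
    """Count authors per company; addresses on unmapped domains are skipped."""
    counts = Counter()
    for email in emails:
        # Take everything after the last "@" and compare case-insensitively.
        domain = email.rsplit("@", 1)[-1].lower()
        company = domain_to_company.get(domain)
        if company is not None:
            counts[company] += 1
    return counts


# Made-up commit author addresses.
authors = ["ann@x.com", "bob@y.com", "cat@y.com", "dan@z.com", "eve@other.org"]

option_a_counts = count_by_company(authors, OPTION_A)
option_b_counts = count_by_company(authors, OPTION_B)
print(dict(option_a_counts))
print(dict(option_b_counts))
```

With Option A all four mapped authors are credited to X, while with Option B the same authors are split across X, Y and Z.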

@abitrolly
Contributor

Having a more fine-grained approach would be interesting for exploring the open source policies between subsidiaries and their parent companies.

@devashish-gaikwad
Contributor Author

Yes, I agree. Perhaps my issue sounds like a feature request, but it is actually a question.

@cm-howard
Collaborator

Thanks for the conversation so far - this is certainly something we've been discussing as a team too, so it's great to hear it being raised by the community.

We don't really want to prescribe how organisations should define themselves within the rankings. Instead, we're hoping that those responsible for such decisions take a mature approach that's in the spirit of what we're attempting to measure and doesn't simply allow large groups of companies to dominate the rankings.

Including smaller subsidiaries in your parent grouping (Option A) makes sense when those companies still work closely together on their Open Source engagements. At the same time, we'd hope an organisation such as Google wouldn't list YouTube and Nest as child domains, and would instead split them out as their own entities to reflect their different approaches to Open Source contribution.

What do you think? I'm keen to hear more conversation on this...

@abitrolly
Contributor

YouTube and Google Nest are subsidiaries of Alphabet Inc. I don't think that this holding company commits to open source or cares about it.

It is harder to split aggregates than to aggregate pieces. In the end, "make it optional" always works.

@devashish-gaikwad
Contributor Author

@cm-howard I agree with your observation that Option A makes more sense. Organisations and their subsidiaries whose IT operations are closely tied together should be represented as one.

@devashish-gaikwad
Contributor Author

devashish-gaikwad commented Apr 23, 2021

Should I add a summary of this discussion to the README.md?
Something of the sort:
"Currently, the decision to list the subsidiaries of an organisation under a single entry or multiple entries is left to the organisation in good faith"
under this heading

I am just starting out with OSS contributions, so this might be a good kick-start for me.
You are welcome to assign me any beginner-friendly issues/requests 😄

@cm-howard
Collaborator

Thanks for your comments @devashish-gaikwad and for the suggestion of updating the README too.

We're going to have a look at this as a team and see how we can reflect the discussion appropriately. In the meantime we'll be sure to suggest any potential issues that would be a good starting point. It's great to see your interest in the solution.

cm-howard added the question (Further information is requested) label Apr 26, 2021

@abitrolly
Contributor

I would be interested to see a dataset of companies and their subsidiaries as of a specific date. I'm not sure whether the dataset should be collected in this repo, but it could be if there is no such source elsewhere.

@cm-howard
Collaborator

That's great to hear @abitrolly - we're actually working on a whole range of improvements for the index at the moment, one of which is flexible date range selection.

I also like the idea of making the index aware of each organization's subsidiaries, so we can think about how we might include this as a potential feature. @vlad-isayko @Uliana2019

@abitrolly
Contributor

abitrolly commented Apr 27, 2021

To take the idea a bit further, it would also be interesting to see when open source commits made by a company were contributed under a paid contract with another company. In that case the real contributor could be the company that paid.

@patrickstephens2

patrickstephens2 commented Sep 27, 2021

To the original question...
When we originally defined the list of companies for OSCI and the domain-to-company mapping, we looked at the domains we saw occurring in the data and mapped these to companies to the best of our abilities. We did lots of googling to discover where domains belonged to subsidiaries of other companies (e.g. egencia.com, a subsidiary of Expedia Group). We excluded freemail addresses, which is easy for well-known US providers but more work for those around the world. And so on.

Subsequently, more than once, this was reviewed and extended.

We used some rules of thumb along the way. Example 1: if we became aware that a large company had acquired a small company, we would typically roll the domains of the small company under the big one. Example 2: if a company acquired a company which was a major and well-known player in the open source world, and it looked like that acquired company would continue to operate as a relatively independent entity, we would tend to keep them as separate companies.

At the end of the day, this was "best effort" and may have errors and omissions. As an open source project we wanted to encourage companies to submit pull requests with their own "additions and corrections". I've seen some of these, which is great. IMO this is the way to go forward rather than having any hard rule. But guidelines would be useful.
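
For anyone picturing the mapping step described above (domains resolved to companies, freemail excluded, subsidiaries rolled under a parent), a rough sketch could look like the following. The names `FREEMAIL_DOMAINS`, `DOMAIN_TO_COMPANY` and `company_for_email` are hypothetical, the freemail list is deliberately tiny, and the `expedia.com` entry is an assumption added next to the egencia.com example; none of this is taken from the OSCI code base.

```python
# Hypothetical, deliberately short freemail list; a real one would be far longer.
FREEMAIL_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com"}

# Hypothetical domain-to-company map.
DOMAIN_TO_COMPANY = {
    "expedia.com": "Expedia Group",  # assumed parent domain, for illustration
    "egencia.com": "Expedia Group",  # subsidiary rolled under the parent
}


def company_for_email(email):
    """Return the mapped company, or None for freemail or unknown domains."""
    # Take everything after the last "@" and normalise it.
    domain = email.rsplit("@", 1)[-1].strip().lower()
    if domain in FREEMAIL_DOMAINS:
        return None
    return DOMAIN_TO_COMPANY.get(domain)


print(company_for_email("dev@egencia.com"))
print(company_for_email("someone@gmail.com"))
```

Keeping an acquired company separate, as in Example 2, would just mean giving its domain its own company name in the map instead of the parent's.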

4 participants