I get a lot of mail with questions about our (Undev) Gitlab fork. And… I got too lazy to answer similar letters every time, so I decided to write an article about it :) In this article I try to describe the fork and its main features.
I did not plan to write about each feature in great detail. If you have a question or can't understand something, please write a comment and I'll update the article.
In September 2012 I joined the Infrastructure projects team at Undev. It was very interesting… but… old Gitorious, Rails 2, ruby 1.8.7ee… Shit. Ok. After some time of terrible work I started a discussion with our manager about moving to another system. We had a lot of plans (tickets), problems with Gitorious, and problems with supporting it.
Challenge accepted, and we started research. By November 2012 we had a big (very big) table comparing different systems. After some testing Gitlab won and…
No!!!… Gitlab did not have some features that were important for us…
Okay. Let's go, guys! We started working on new features.
By May 2013 we had a good installation, although with some changes in architecture, which I'll write about later. We put Gitlab into production in May 2013.
Below I describe the most important features of our fork. The difference between our fork and the official repository is huge, and we cannot send all features upstream, for various reasons. One example: some features come with a very big diff and a hard migration, and are useful mainly for a big company.
Gitlab 3 did not have teams. Managing user access across a lot of projects was terrible work. If we wanted to give a user permission to push to a project, we had to add the user to the project team directly. What about hundreds of users and thousands of projects? As a result, we started the Teams feature. The first version of this feature was not beautiful and in places hardcoded, and in Gitlab 6 it was replaced with Groups by Dmitry Zaporozhets. I agree with him: for small companies and teams the Teams feature is overhead. But we could not abandon the Teams feature, so we rewrote it. Now we support this feature in a new implementation.
We have 3 levels of user access to projects:
Add a user directly to a project: User -> Project
Add a user to the group in which the project is located: User -> Group -> Project
Add a user to a Team, which can be assigned to a project directly or to a group of projects: User -> Team -> Project, or User -> Team -> Group -> Project
This scheme is very comfortable for our Managers and Project Masters/Owners ;)
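The access paths listed above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `AccessMap` and all of its method names are hypothetical and are not taken from the fork's code, and real permissions would also carry access levels (guest, developer, master), which are left out here.

```ruby
require "set"

# Hypothetical in-memory model; class and method names are illustrative only.
class AccessMap
  def initialize
    @project_users = Hash.new { |hash, key| hash[key] = Set.new }
    @group_users   = Hash.new { |hash, key| hash[key] = Set.new }
    @team_users    = Hash.new { |hash, key| hash[key] = Set.new }
    @project_teams = Hash.new { |hash, key| hash[key] = Set.new }
    @group_teams   = Hash.new { |hash, key| hash[key] = Set.new }
    @project_group = {}
  end

  def add_user_to_project(user, project)
    @project_users[project] << user
  end

  def add_user_to_group(user, group)
    @group_users[group] << user
  end

  def add_user_to_team(user, team)
    @team_users[team] << user
  end

  def assign_team_to_project(team, project)
    @project_teams[project] << team
  end

  def assign_team_to_group(team, group)
    @group_teams[group] << team
  end

  def place_project_in_group(project, group)
    @project_group[project] = group
  end

  # Returns every path through which the user reaches the project.
  def access_paths(user, project)
    group = @project_group[project]
    paths = []
    paths << "User -> Project" if @project_users[project].include?(user)
    paths << "User -> Group -> Project" if group && @group_users[group].include?(user)
    paths << "User -> Team -> Project" if team_member?(user, @project_teams[project])
    paths << "User -> Team -> Group -> Project" if group && team_member?(user, @group_teams[group])
    paths
  end

  def access?(user, project)
    access_paths(user, project).any?
  end

  private

  def team_member?(user, teams)
    teams.any? { |team| @team_users[team].include?(user) }
  end
end

map = AccessMap.new
map.place_project_in_group("api", "backend")
map.add_user_to_team("alice", "core-team")
map.assign_team_to_group("core-team", "backend")
map.access_paths("alice", "api") # => ["User -> Team -> Group -> Project"]
```

With such a model, adding one user to one Team is enough to reach every project of every group the Team is assigned to, which is the point of the feature for hundreds of users and thousands of projects.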
Then we (with Andrey Kulakov) started the mail notification feature, and we chose an event-based way of generating mail. A user can subscribe to the base Entities: Project, Group, Team, User. When a user action creates an Event, we create mail notifications based on this Event and send them asynchronously. I'll describe mail notifications in more detail later. Now about events.
At the beginning we had one huge problem. All events in Gitlab were related to a project. We had no events for a Group, a Team or any other Entity. Only Project.
Event(id: integer,
      target_type: string,   # Polymorphic association with
      target_id: integer,    # the Entity in which the event occurred
      data: text,            # Serialized event data
      project_id: integer,   # Only project ;,( So sad...
      created_at: datetime,
      action: integer,       # Action number
      author_id: integer)
And all events were described with 9 action constants:
CREATED = 1
UPDATED = 2
CLOSED = 3
REOPENED = 4
PUSHED = 5
COMMENTED = 6
MERGED = 7
JOINED = 8 # User joined project
LEFT = 9 # User left project
So, it was a problem… There could be an image of an angry cat here, but instead we started a new Events feature.
We had the following requirements:
As a result we have this solution:
Any Entity can be a target
Any Entity can be a source
A rich action description (part of them):
GENERAL = [
:created,
:updated,
:commented,
:deleted,
:added,
:removed,
:joined,
:left,
:transfer,
]
COMMENTS = [
:commented_merge_request,
:commented_issue, # not used in our fork
# because we use another issue tracker
:commented_commit,
:commented, # not used after remove Project Wall
]
MERGE_REQUESTS = [
:opened,
:closed,
:reopened,
:merged,
:assigned,
:reassigned,
:resigned,
]
MASS = [
:imported,
:members_added,
:members_updated,
:members_removed,
:teams_added,
:teams_removed,
:groups_added,
:projects_added
]
GIT = [
:pushed,
:created_branch,
:deleted_branch,
:created_tag,
:deleted_tag,
:protected,
:unprotected,
:blocked,
:activate,
]
# NOTE actions which can be parent
BASE = [
:create,
:update,
:delete,
:open,
:close,
:reopen,
:merge,
:block,
:activate
]
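To make the "any Entity can be source or target" idea concrete, here is a minimal sketch in plain Ruby. It is not the fork's model: `SketchEvent` and `SketchActions` are hypothetical names, the action lists are abbreviated from the ones above, and the way source and target are used in the example is an assumption.

```ruby
# Hypothetical sketch: abbreviated copies of the action lists above.
module SketchActions
  GENERAL = %i[created updated commented deleted added removed joined left transfer].freeze
  GIT     = %i[pushed created_branch deleted_branch created_tag deleted_tag].freeze
  ALL     = (GENERAL + GIT).freeze
end

# A plain stand-in for a model. Source and target are arbitrary objects,
# so a Team, Group, User or Project can appear on either side.
SketchEvent = Struct.new(:action, :source, :target, :author, keyword_init: true) do
  # An event is usable only with a known action and both entities present.
  def valid?
    SketchActions::ALL.include?(action) && !source.nil? && !target.nil?
  end
end

team  = { type: "Team", id: 7 }
group = { type: "Group", id: 3 }
event = SketchEvent.new(action: :added, source: team, target: group, author: "alice")
event.valid? # => true
```

In a Rails model the same idea would normally be expressed with two polymorphic associations instead of plain objects; the struct is used here only to keep the example self-contained.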
An event can have a parent event. For example:
We create events for every action in the system, but we keep the parent-child relation.
Push created |- Event(action: pushed)
             |- Event(action: commented_commit)
             |- Event(action: closed) # for MergeRequest
             |  |- Event(action: created) # Note was created
             |  |- Event(action: commented_merge_request)
             |
             |- Event(action: closed) # for Issue
                |- Event(action: created) # Note was created
                |- Event(action: commented_issue)
Based on these events we can send emails for different subscriptions without duplicates. And we can trace the source of some troubles. I think it's awesome! :)
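One way the parent-child relation can prevent duplicate mails is to collapse every matching event to the top of its chain and send one mail per subscriber per chain. The sketch below assumes exactly that; `TreeEvent`, `mails_for` and the shape of the subscriptions hash are hypothetical, and treating the push event as the top of the chain is my reading of the example above.

```ruby
# Hypothetical sketch of parent-child events; names are illustrative only.
TreeEvent = Struct.new(:id, :action, :parent) do
  # Walk up to the top-most event of the chain.
  def root
    parent ? parent.root : self
  end
end

# subscriptions: { user => [actions the user wants mail for] }
# Returns unique [user, root_event_id] pairs, so a user gets one mail
# per chain no matter how many events of that chain matched.
def mails_for(events, subscriptions)
  mails = events.flat_map do |event|
    interested = subscriptions.select { |_user, actions| actions.include?(event.action) }
    interested.keys.map { |user| [user, event.root.id] }
  end
  mails.uniq
end

push   = TreeEvent.new(1, :pushed, nil)
closed = TreeEvent.new(2, :closed, push)
note   = TreeEvent.new(3, :commented_merge_request, closed)

subscriptions = {
  "alice" => [:pushed, :closed],
  "bob"   => [:commented_merge_request]
}

mails_for([push, closed, note], subscriptions)
# => [["alice", 1], ["bob", 1]]
```

Alice matches two events of the same chain but receives a single mail, and the root id also gives a starting point when tracing where a notification came from.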
After rewriting events we have:
I am not exaggerating. Why I chose such a header you will understand later.
After rewriting events we created our own mail notifications.
Our workflow:
At the moment we have more than 100 different mails for different cases. And there can be more :)
TODO: We plan to create a notification page with an option to show notifications there or send them by email.
TODO: Add the ability to subscribe to a MergeRequest or an Issue. Replace participants with auto subscriptions.
TODO: Add the ability to create ignore subscriptions.
Gitlab has integration with different services and that's OK, but its implementation suits them, as a SaaS. Why? In the Gitlab code, developers describe the fields and logic of each service. If we open the project services page, Gitlab creates empty records for every service enabled in Gitlab. For the services that are really needed, the user enters some settings. Users fill in the same similar settings again and again for different projects. It's so sad… A service can only send web hooks. But what about providing access to the project code? Writing comments/issues?
We rewrote services, because:
Now:
If you create a service pattern without default settings, the user must fill in the service settings every time. Like in upstream %).
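The role of default settings in a service pattern can be sketched as a simple merge: the pattern holds the defaults once, and a project stores only what differs. Everything below is an assumption for illustration (`ServicePattern`, `effective_settings`, `missing_settings` and the setting keys); it is not the fork's implementation.

```ruby
# Hypothetical sketch: a service pattern carries default settings,
# and a project only stores the values that differ from them.
ServicePattern = Struct.new(:name, :defaults)

# Project overrides win over the pattern defaults.
def effective_settings(pattern, project_overrides = {})
  pattern.defaults.merge(project_overrides)
end

# Settings the user still has to fill in by hand for this project.
def missing_settings(pattern, required_keys, project_overrides = {})
  settings = effective_settings(pattern, project_overrides)
  required_keys.reject { |key| settings.key?(key) }
end

jenkins = ServicePattern.new("jenkins", { host: "ci.example.test", token: "secret" })
effective_settings(jenkins, { job: "api-build" })
# => {:host=>"ci.example.test", :token=>"secret", :job=>"api-build"}

empty = ServicePattern.new("jenkins", {})
missing_settings(empty, [:host, :token])
# => [:host, :token]
```

The second call shows the case from the text: a pattern without defaults leaves every required field for the user to fill in per project.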
What about search functionality in Gitlab?
In official Gitlab CE we can search:
Projects, Groups, MergeRequest, Issue - %like% query.
Code - git grep
…
At the moment the Gitlab core team prefers PostgreSQL, and PostgreSQL has good full-text search. But we wanted a more flexible search (sometimes users make mistakes in the query) and code search across all repositories. For this reason we replaced search with ElasticSearch.
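Tolerance to mistakes in the query is what the `fuzziness` option of Elasticsearch match queries provides. Below is a minimal sketch of a request body built in Ruby; the field names `name` and `description` and the helper `fuzzy_search_body` are assumptions, and the queries used by our gem may look different.

```ruby
require "json"

# Builds an Elasticsearch request body for a typo-tolerant search.
# The default field names are assumptions made for this example.
def fuzzy_search_body(text, fields: %w[name description])
  {
    query: {
      multi_match: {
        query: text,
        fields: fields,
        fuzziness: "AUTO" # let Elasticsearch pick the edit distance by term length
      }
    }
  }
end

# The JSON that would be sent as the body of a search request.
puts JSON.pretty_generate(fuzzy_search_body("gitlba"))
```

A plain `%like%` query has no equivalent of this option, which is one reason a misspelled project name finds nothing there.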
Example of results:
We can search in different entities:
We can search code across different repositories (as you can see, we can filter the search by language)
Summary: at the moment search is available in:
For the integration with ElasticSearch I wrote a gem. I tried to keep the interface, but that is impossible without rewriting code. We plan to create a PR into the official repository, but not soon.
Historically, our company uses Jenkins. So we started to research how we could integrate Gitlab with Jenkins.
We found the Gitlab Hook plugin.
Ok. Jenkins + plugin -> Gitlab…
Ok. A lot of projects… A huge number of MergeRequests… As a result we got full-time polling. How do you think that ended? Yes. Gitlab died. Sadly.
We wrote our own plugin for Jenkins. Thanks to Alexey Sorokin.
How it works now:
-> User pushes code
|- Gitlab runs service hooks
|- The Gitlab service selects data and sends it to Jenkins
|- Jenkins runs the build and afterwards sends the build data as JSON to Gitlab
|- Gitlab parses the data and shows it in the Web UI.
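The last two steps of the flow above can be sketched as parsing a JSON build report. The payload keys (`sha`, `status`, `url`) and the names `BuildResult`, `parse_build_report` and `first_broken` are hypothetical; the real format exchanged between our Jenkins plugin and Gitlab is not shown in this article.

```ruby
require "json"

# Hypothetical build report; the keys are assumptions, not the plugin's format.
BuildResult = Struct.new(:sha, :status, :url, keyword_init: true)

# Parses one report as Jenkins might post it back after a build.
def parse_build_report(json)
  data = JSON.parse(json)
  BuildResult.new(
    sha: data.fetch("sha"),                       # commit the build ran for
    status: data.fetch("status").downcase.to_sym, # e.g. :success or :failure
    url: data["url"]                              # optional link to the build page
  )
end

# Given results in push order (oldest first), finds the first failed build.
def first_broken(results)
  results.find { |result| result.status == :failure }
end

reports = [
  '{"sha":"abc123","status":"SUCCESS","url":"http://ci.example.test/1"}',
  '{"sha":"def456","status":"FAILURE","url":"http://ci.example.test/2"}'
]
results = reports.map { |json| parse_build_report(json) }
first_broken(results).sha # => "def456"
```

Because Jenkins pushes the report to Gitlab after each build, nothing has to poll Gitlab for changes, which was the problem with the previous setup.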
As a result we know which push broke the code and can fix it quickly.
And the same results are shown in Merge Requests :)
Many developers are involved in many projects. Finding a particular project is difficult: you have to use the search, which is an extra step and an inconvenience. We decided to somehow identify the projects in which the user is currently involved or which the user watches. The result of the brainstorm: we started the Favorited projects feature.
A user can add a project to the Favorites list.
When a user adds a project to this list, the marked projects are rendered at the top of the projects list in the dashboard sidebar:
And the user can filter the dashboard events feed to show events for favorited projects only.
The list of favorited projects can be edited in the profile section.
All these features are also available for Groups, Teams and Users.
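The two behaviours described above, favorites first in the sidebar and a feed filtered by favorites, come down to a partition and a select. A minimal sketch with hypothetical helper names (`sidebar_order`, `favorited_feed`) and plain hashes standing in for events:

```ruby
require "set"

# Favorited projects first, the rest after them; relative order is kept.
def sidebar_order(projects, favorites)
  favorited, others = projects.partition { |project| favorites.include?(project) }
  favorited + others
end

# Keeps only the events that belong to favorited projects.
def favorited_feed(events, favorites)
  events.select { |event| favorites.include?(event[:project]) }
end

favorites = Set["gitlab", "docs"]
sidebar_order(%w[api docs gitlab web], favorites)
# => ["docs", "gitlab", "api", "web"]

events = [
  { project: "api",    action: :pushed },
  { project: "gitlab", action: :merged }
]
favorited_feed(events, favorites)
# => [{:project=>"gitlab", :action=>:merged}]
```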
We replaced polling for new notes in Merge Requests, Commits and Issues with websockets. PR here.
Users:
.
├── some_git_home_path
│ └── git
│ ├── .ssh
│ ├── gitlab-shell # symlink to /some/apps_path/gitlab-shell/current
│ ├── ...
│ └── repositories
├── some
│ └── apps_path
│ ├── gitlab
│ │ ├── releases
│ │ │ ├── release_1
│ │ │ ├── release_2
│ │ │ ├── release_3
│ │ │ ├── release_4
│ │ │ └── release_5 # here only code
│ │ ├── shared
│ │ │ ├── bin
│ │ │ ├── bundle
│ │ │ ├── cache
│ │ │ ├── gitlab-satellites
│ │ │ ├── log
│ │ │ ├── pids
│ │ │ ├── public
│ │ │ ├── .secret
│ │ │ ├── system
│ │ │ ├── tmp
│ │ │ └── uploads
│ │ └── current # symlink to last release
│ └── gitlab-shell
│ ├── releases
│ │ ├── release_1
│ │ ├── release_2
│ │ ├── release_3
│ │ ├── release_4
│ │ └── release_5
│ └── current # symlink to last release
└── some_services_path
├── gitlab-resque-main
├── gitlab-resque-gitlab-shell
├── gitlab-resque-main
├── gitlab-resque-elasticsearch
├── gitlab-web-unicorn
├── gitlab-web-unicorn-api
├── gitlab-web-faye
└── etc
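The `releases` plus `current` symlink layout above is the usual Capistrano-style scheme: each deploy gets its own directory, and switching versions means repointing one symlink. The helper below is only an illustration of that idea in a temporary directory; `activate_release` is a hypothetical name and this is not our deployment script.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: creates the release directory if needed and
# repoints the `current` symlink to it.
def activate_release(app_path, release_name)
  release = File.join(app_path, "releases", release_name)
  FileUtils.mkdir_p(release)                        # code only
  FileUtils.mkdir_p(File.join(app_path, "shared"))  # logs, pids, uploads live here
  current = File.join(app_path, "current")
  FileUtils.rm_f(current)                           # removes the old link, not its target
  File.symlink(release, current)
  current
end

Dir.mktmpdir do |apps_path|
  app = File.join(apps_path, "gitlab")
  activate_release(app, "release_1")
  current = activate_release(app, "release_2")
  puts File.readlink(current) # path ending in releases/release_2
end
```

Removing the link and creating a new one leaves a short window without `current`; creating the new link under a temporary name and renaming it over the old one avoids that window.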
Gitlab-shell fork with git protocol support
Our Vagrant VM for development
Elasticsearch-git gem for integration with ElasticSearch