I get a lot of mail with questions about our (Undev) Gitlab fork. And... I am too lazy to answer similar letters every time, so I decided to write an article about it :) In this article I try to describe the fork and its main features.
I did not plan to write a lot of detail about each feature. If you have questions or can't understand something, please write a comment and I'll update the article.
In September 2012 I joined the Infrastructure projects at Undev. It was very interesting... but... old Gitorious, Rails 2, ruby 1.8.7ee... Shit. Ok. After some time of terrible work I started a discussion with our manager about moving to another system. We had a lot of plans (tickets), and problems with Gitorious and with supporting it.
Challenge accepted, and we started research. By November 2012 we had a big (very big) table comparing different systems. After some testing Gitlab won and...
No!!!... Gitlab did not have some features that were important for us...
Okay. Let's go, guys! We started work on new features.
- First of all, PostgreSQL support. Yes, I do not like MySQL :)
- Migration script from Gitorious to Gitlab
- Mail notifications
- Teams support for permissions management
- Snippets (we no longer use this feature, because we prefer the Wiki for similar things %) )
By May 2013 we had a good installation, though with some changes in architecture, which I'll write about later. We launched Gitlab in production in May 2013.
Most valuable Changes
Below I describe the most important features of our fork. The difference between our fork and the official repository is huge, and we cannot send all features upstream, for various reasons. One example: some features, useful mainly for a big company, come with a very big diff and a hard migration.
In Gitlab 3 there were no teams. Managing user access across a lot of projects was terrible work. If we wanted to give a user permission to push to a project, we had to add the user to the project team directly. What about hundreds of users and thousands of projects? As a result, we started the Teams feature. The first version of this feature was not beautiful, in places hardcoded, and in Gitlab 6 it was replaced with Groups by Dmitry Zaporozhets. I agree with him: for small companies and teams the Team feature is overhead. But we could not abandon the Team feature, so we rewrote it. Now we support this feature in a new implementation.
We have 3 levels of user access to projects:
- Add the user directly to a project: User -> Project
- Add the user to the group in which the project is located: User -> Group -> Project
- Add the user to a Team, which can be assigned to a project directly or to a group of projects: User -> Team -> Project, or User -> Team -> Group -> Project
This scheme is very comfortable for our managers and Project Masters/Owners ;)
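The access levels above could be modeled roughly like this. It is only a sketch in plain Ruby: the `Team`, `Group` and `Project` structs and the `access_paths` helper are hypothetical names for illustration, not the fork's actual Rails models.

```ruby
# Hypothetical plain-Ruby structs; the real fork uses Rails models.
Team    = Struct.new(:name, :members)
Group   = Struct.new(:name, :members, :teams)
Project = Struct.new(:name, :group, :members, :teams)

# Returns every path through which +user+ reaches +project+.
def access_paths(user, project)
  group = project.group
  paths = []
  paths << "User -> Project" if project.members.include?(user)
  paths << "User -> Group -> Project" if group && group.members.include?(user)
  if project.teams.any? { |team| team.members.include?(user) }
    paths << "User -> Team -> Project"
  end
  if group && group.teams.any? { |team| team.members.include?(user) }
    paths << "User -> Team -> Group -> Project"
  end
  paths
end

backend = Team.new("backend", ["alice"])
infra   = Group.new("infrastructure", ["bob"], [backend])
gitlab  = Project.new("gitlab", infra, ["carol"], [])

%w[alice bob carol dave].each do |user|
  puts "#{user}: #{access_paths(user, gitlab).inspect}"
end
```

A user with an empty list of paths has no access at all, which is the case a manager usually wants to spot quickly.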
New event model
Then we (with Andrey Kulakov) started the mail notification feature, and we chose an event-based way of generating mail. A user can subscribe to the base entities: Project, Group, Team, User. When a user action creates an Event, we create mail notifications based on this Event and send them asynchronously. I'll describe mail notifications in more detail later. Now about events.
At the beginning we had one huge problem. All events in Gitlab were related to a project. There were no events for a Group, a Team or any other entity. Only Project.
Event(id: integer,
      target_type: string,  # Polymorphic association with
      target_id: integer,   # the entity in which the event happened
      data: text,           # Serialized event data
      project_id: integer,  # Only project ;,( It is so sad...
      created_at: datetime,
      action: integer,      # Action number.
      author_id: integer)
And all events were described with 9 action constants:
CREATED   = 1
UPDATED   = 2
CLOSED    = 3
REOPENED  = 4
PUSHED    = 5
COMMENTED = 6
MERGED    = 7
JOINED    = 8 # User joined project
LEFT      = 9 # User left project
So, it was a problem... Here there could be an image of an angry cat, but instead we started the new Events feature.
We had the following requirements:
- Any entity in the system can be an Event target (the entity for which the event was created)
- Any entity can be an Event source (the entity that triggered the event)
- Ability to describe an action with a human-readable name
- Store event data for future use
As a result we have this solution: any entity can be a target
Any entity can be a source
And there are rich action descriptions (part of them):
GENERAL = [
  :created, :updated, :commented, :deleted, :added,
  :removed, :joined, :left, :transfer,
]
COMMENTS = [
  :commented_merge_request,
  :commented_issue,  # not used in our fork
                     # because we use another issue tracker
  :commented_commit,
  :commented,        # not used after remove Project Wall
]
MERGE_REQUESTS = [
  :opened, :closed, :reopened, :merged,
  :assigned, :reassigned, :resigned,
]
MASS = [
  :imported,
  :members_added, :members_updated, :members_removed,
  :teams_added, :teams_removed,
  :groups_added, :projects_added
]
GIT = [
  :pushed,
  :created_branch, :deleted_branch,
  :created_tag, :deleted_tag,
  :protected, :unprotected,
  :blocked, :activate,
]
# NOTE actions which can be parent
BASE = [
  :create, :update, :delete, :open, :close,
  :reopen, :merge, :block, :activate
]
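To make the "any entity can be source or target" idea concrete, here is a minimal sketch in plain Ruby. Everything in it is an assumption for illustration: the `Developer` and `Repository` structs, the shortened `KNOWN_ACTIONS` list, the `Event` struct and the `build_event` helper are hypothetical and are not taken from the fork's code.

```ruby
# Hypothetical entities; in the fork these would be Rails models.
Developer  = Struct.new(:name)
Repository = Struct.new(:name)

# A shortened, assumed list of symbolic actions.
KNOWN_ACTIONS = [:created, :updated, :deleted, :pushed, :joined, :left].freeze

# An event keeps references to both objects, whatever their class is.
Event = Struct.new(:action, :source, :target, :data) do
  def source_type
    source.class.name
  end

  def target_type
    target.class.name
  end

  def to_s
    "#{source_type}(#{source.name}) #{action} #{target_type}(#{target.name})"
  end
end

# Rejects actions that are not in the known list.
def build_event(action, source:, target:, data: {})
  unless KNOWN_ACTIONS.include?(action)
    raise ArgumentError, "unknown action: #{action.inspect}"
  end
  Event.new(action, source, target, data)
end

event = build_event(:pushed,
                    source: Developer.new("alice"),
                    target: Repository.new("gitlab"))
puts event.to_s
```

Storing the class name next to the object mirrors the `target_type`/`target_id` pair of a polymorphic association, only without a database.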
An event can have a parent event. For example:
- A user pushed code to the server
- After the push a note was created
- And a MergeRequest was closed
- And an Issue was closed
We create events for every action in the system, but we keep the parent-child relation.
Push created
|- Event(action: pushed)
|- Event(action: commented_commit)
|- Event(action: closed) # for MergeRequest
| |- Event(action: created) # Note was created
| |- Event(action: commented_merge_request)
| |- Event(action: closed) # for Issue
|- Event(action: created) # Note was created
|- Event(action: commented_issue)
Based on these events we can send email for different subscriptions without duplicates. And we can investigate the source of some troubles. I think it's awesome! :)
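A rough sketch of how a parent-child event tree helps to avoid duplicate mail. The `EventNode` struct, the string identifiers such as `"merge_request:7"`, the exact nesting of the example tree and the shape of the `subscriptions` hash are all assumptions made for this example, not the fork's real data model.

```ruby
require "set"

# Hypothetical event node: an action, the entity it touched, and child events.
EventNode = Struct.new(:action, :target, :children) do
  def add_child(action, target)
    child = EventNode.new(action, target, [])
    children << child
    child
  end

  # Depth-first walk over this event and all of its descendants.
  def each_event(&block)
    return enum_for(:each_event) unless block
    block.call(self)
    children.each { |child| child.each_event(&block) }
  end
end

# subscriptions: { "user" => ["entity", ...] } (assumed shape).
# Each subscriber is returned once for the whole tree, even if several
# events in the tree match their subscriptions.
def recipients_for(root, subscriptions)
  touched = root.each_event.map(&:target).to_set
  subscriptions
    .select { |_user, targets| targets.any? { |target| touched.include?(target) } }
    .keys
end

push = EventNode.new(:pushed, "project:gitlab", [])
push.add_child(:commented_commit, "commit:abc123")
merge_request = push.add_child(:closed, "merge_request:7")
merge_request.add_child(:commented_merge_request, "merge_request:7")
issue = push.add_child(:closed, "issue:42")
issue.add_child(:commented_issue, "issue:42")

subscriptions = {
  "alice" => ["project:gitlab", "merge_request:7"],
  "bob"   => ["issue:42"],
  "carol" => ["project:other"]
}
puts recipients_for(push, subscriptions).inspect
```

Because the whole tree hangs off one root event, a single pass is enough to decide who gets mail, and the root also answers the question "which action caused all of this?".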
After rewriting events we have:
- Ability to detect who did what in Gitlab when there are fuckups
- Fully informative dashboards for the Main, Project, Group, Team and User pages.
- Flexible mail subscriptions with notifications
- Ability to send mail digests
Awesome mail notifications
I am not exaggerating. You will understand later why I wrote such a header.
After rewriting events we created our own mail notifications.
- A user subscribes to a base entity
- The user can edit detailed settings for any subscription
- Another user does something in Gitlab
- After the user's action, Gitlab triggers the event creation process, and we get a tree of events (described above)
- Based on the created events we create notifications for subscribers
- And send mails to subscribers in an async queue.
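The steps above can be sketched in a few lines of plain Ruby. This is an illustration only: the `Notification` struct, the two helper methods and the hash shapes are hypothetical, and an in-process `Queue` with one worker thread stands in for a real background job system.

```ruby
require "thread"

Notification = Struct.new(:recipient, :subject)

# events:        array of { action:, target: } hashes (assumed shape)
# subscriptions: { "email" => ["entity", ...] }        (assumed shape)
def build_notifications(events, subscriptions)
  events.flat_map do |event|
    subscriptions
      .select { |_email, targets| targets.include?(event[:target]) }
      .keys
      .map { |email| Notification.new(email, "#{event[:action]} on #{event[:target]}") }
  end
end

# Pushes notifications into a queue and lets a worker thread "send" them.
# Returns the list of delivered messages.
def deliver_async(notifications)
  queue = Queue.new
  sent = []
  worker = Thread.new do
    while (note = queue.pop) != :done
      # A real worker would hand the message to a mailer here.
      sent << "To: #{note.recipient} / #{note.subject}"
    end
  end
  notifications.each { |note| queue << note }
  queue << :done
  worker.join
  sent
end

events = [{ action: :pushed, target: "project:gitlab" }]
subscriptions = {
  "alice@example.com" => ["project:gitlab"],
  "bob@example.com"   => ["group:infrastructure"]
}
puts deliver_async(build_notifications(events, subscriptions))
```

Keeping "build notifications" and "deliver" as separate steps is what makes it possible to send the same notifications later as a digest instead of one mail per event.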
At this moment we have more than 100 different mails for different cases. And there can be more :)
TODO: We plan to create a notification page with an option to show notifications or send them by email.
TODO: Add the ability to subscribe to a MergeRequest or Issue. Replace participants with auto subscriptions.
TODO: Add the ability to create ignore subscriptions.
New Services logic
Gitlab has integration with different services and that's OK, but the implementation suits them, as a SaaS. Why? In the Gitlab code, developers describe the fields and logic of services. If we open the project services page, Gitlab creates empty records for every service enabled in Gitlab. For a service that is really needed, the user enters some settings. Every time, the user fills in similar settings for different projects. It is so sad... A service can only send web hooks. But what about providing access to project code? Writing comments/issues?
We rewrote services, because:
- For one organization it is ugly to write the same config many times when a service is added to different projects. We want to write the config once and enable the service for many projects.
- We want the ability to add a deploy key for a service.
- We want the ability to integrate Gitlab with various other systems
- In the Admin panel we add a Service pattern with default values
- We can add an SSH key for the service (like a deploy key)
- In any project we can enable this service pattern (and edit the config, if needed)
If you create a service pattern without default settings, the user must fill in the service settings every time. Like in upstream %).
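The idea of a pattern with defaults and per-project overrides could look roughly like this. It is a sketch under assumptions: `ServicePattern`, `service_settings` and the setting keys `url` and `job` are hypothetical names, and a `nil` default is used here to mean "the user must fill this in".

```ruby
# Hypothetical service pattern: a name plus default settings.
ServicePattern = Struct.new(:name, :defaults)

# Settings for one project: pattern defaults overridden by project values.
# Raises if a setting has neither a default nor a project value.
def service_settings(pattern, overrides = {})
  settings = pattern.defaults.merge(overrides)
  missing = settings.select { |_key, value| value.nil? }.keys
  unless missing.empty?
    raise ArgumentError, "missing settings: #{missing.join(', ')}"
  end
  settings
end

jenkins = ServicePattern.new("jenkins", { url: "https://ci.example.com", job: nil })

# The URL is written once in the pattern; each project only names its job.
puts service_settings(jenkins, { job: "gitlab" }).inspect
puts service_settings(jenkins, { job: "gitlab-shell" }).inspect
```

A pattern whose defaults are all `nil` behaves like the upstream case: every project has to provide every setting.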
Elasticsearch as search engine
What about search functionality in Gitlab?
In official Gitlab CE we can search:
- Groups (autocomplete)
- MergeRequests (In selected project)
- Issues (In selected project)
- Code (in selected project)
Projects, Groups, MergeRequest, Issue -
At this moment the Gitlab core team prefers PostgreSQL. PostgreSQL has good full-text search, but we want a more flexible search (sometimes users make mistakes in the query) and code search across all repositories. For this reason we replaced the search with ElasticSearch.
Example of results:
We can search in different entities:
We can search code across different repositories (as you can see, we can filter the search by language)
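For illustration, a typo-tolerant code search with a language filter might be expressed as a query body like the one built below. This is only a sketch: the field names `content` and `language` and the helper `code_search_query` are hypothetical, and the `bool`/`match`/`term` structure follows the query DSL of recent Elasticsearch versions, which may differ from the version and mapping our gem actually uses.

```ruby
require "json"

# Builds a query body as a Ruby hash.
# "content" and "language" are assumed field names.
def code_search_query(text, language: nil)
  # fuzziness lets the match tolerate small typos in the query text
  must = [{ match: { content: { query: text, fuzziness: "AUTO" } } }]
  filter = language ? [{ term: { language: language } }] : []
  { query: { bool: { must: must, filter: filter } } }
end

puts JSON.pretty_generate(code_search_query("recieve_pack", language: "ruby"))
```

The hash can be serialized to JSON and sent as the body of a search request; no Elasticsearch client is needed to run the sketch itself.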
Summary: at this moment search is available in:
- etc (if you want)
- Code across all repositories
- Files by content
- Files by file-path and file-name
- Commits across all repositories
- Commit author name, email
- Committer name, email
- Commit message
- Commit SHA
For the integration with ElasticSearch I wrote a gem. I tried to keep the interface, but that is impossible without rewriting code. We plan to create a PR to the official repository, but not soon.
Historically, our company uses Jenkins. So we started researching how we could integrate Gitlab with Jenkins.
We found the Gitlab Hook plugin.
Ok. Jenkins + plugin -> Gitlab...
Ok. A lot of projects.... A huge number of MergeRequests... As a result we got full-time polling. What do you think happened in the end? Yes. Gitlab died. Sadly.
How it works now:
-> User pushed code
|- Gitlab runs service hooks
|- The Gitlab service selects data and sends it to Jenkins
|- Jenkins runs the build and afterwards sends the build data as JSON to Gitlab
|- Gitlab parses the data and shows it in the Web UI.
As a result we know which push broke the code and can fix it quickly.
And the same results are shown in Merge Requests :)
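The last step of this flow, parsing the build data and finding the push that broke the code, might look roughly like this. The payload keys `sha`, `status` and `url`, the status values and both helper names are assumptions made for the sketch; the real JSON sent by Jenkins to our fork is not shown in this article.

```ruby
require "json"

BuildStatus = Struct.new(:sha, :status, :url)

# Parses one build report. The keys are hypothetical.
def parse_build_report(json)
  data = JSON.parse(json)
  BuildStatus.new(data.fetch("sha"), data.fetch("status"), data["url"])
end

# reports must be ordered from oldest to newest push.
# Returns the SHA of the first failed build, or nil if nothing failed.
def first_broken_push(reports)
  broken = reports.find { |report| report.status == "failed" }
  broken && broken.sha
end

payloads = [
  '{"sha": "1a2b3c", "status": "success", "url": "https://ci.example.com/job/1"}',
  '{"sha": "4d5e6f", "status": "failed",  "url": "https://ci.example.com/job/2"}'
]
reports = payloads.map { |payload| parse_build_report(payload) }
puts first_broken_push(reports).inspect
```

Because Jenkins pushes the result to Gitlab after each build, nothing has to poll Gitlab, which is the difference from the plugin-based setup described above.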
Favorited projects (entities)
Many developers are involved in many projects. Finding one of these projects is difficult: you have to use the search, which is an extra step and an inconvenience. We decided to somehow identify the projects in which the user is currently involved, or which the user watches. The result of the brainstorm: we started the Favorited projects feature.
A user can add a project to the Favorites list.
When a user adds a project to this list, the marked projects are rendered at the top of the projects in the dashboard sidebar:
The user can also filter the dashboard events feed to show events only for favorited projects.
The list of favorited projects can be edited in the profile section.
And all these features are available for Groups, Teams and Users.
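Both behaviours, favorites on top of the sidebar and a feed filtered by favorites, are simple to express. The sketch below uses plain strings for projects and hypothetical helper names (`sidebar_order`, `favorited_feed`); the event hash shape is also an assumption.

```ruby
# Favorited projects first; both parts keep their original order.
def sidebar_order(projects, favorites)
  favorited, others = projects.partition { |project| favorites.include?(project) }
  favorited + others
end

# events: array of hashes with a :project key (assumed shape).
def favorited_feed(events, favorites)
  events.select { |event| favorites.include?(event[:project]) }
end

projects  = ["billing", "gitlab", "gitlab-shell", "wiki"]
favorites = ["gitlab-shell", "gitlab"]
events = [
  { project: "wiki",   action: :pushed },
  { project: "gitlab", action: :merged }
]

puts sidebar_order(projects, favorites).inspect
puts favorited_feed(events, favorites).inspect
```

The same two helpers would work unchanged for groups, teams or users, since they only compare list entries.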
More performance with websockets
We replaced polling for new notes in Merge Requests, Commits and Issues with websockets. PR here.
Access to files via token
Git protocol support
- Run application
- Work with code
- Deploy application
.
├── some_git_home_path
│   └── git
│       ├── .ssh
│       ├── gitlab-shell # symlink to /some/apps_path/gitlab-shell/current
│       ├── ...
│       └── repositories
├── some
│   └── apps_path
│       ├── gitlab
│       │   ├── releases
│       │   │   ├── release_1
│       │   │   ├── release_2
│       │   │   ├── release_3
│       │   │   ├── release_4
│       │   │   └── release_5 # here only code
│       │   ├── shared
│       │   │   ├── bin
│       │   │   ├── bundle
│       │   │   ├── cache
│       │   │   ├── gitlab-satellites
│       │   │   ├── log
│       │   │   ├── pids
│       │   │   ├── public
│       │   │   ├── .secret
│       │   │   ├── system
│       │   │   ├── tmp
│       │   │   └── uploads
│       │   └── current # symlink to last release
│       └── gitlab-shell
│           ├── releases
│           │   ├── release_1
│           │   ├── release_2
│           │   ├── release_3
│           │   ├── release_4
│           │   └── release_5
│           └── current # symlink to last release
└── some_services_path
    ├── gitlab-resque-main
    ├── gitlab-resque-gitlab-shell
    ├── gitlab-resque-main
    ├── gitlab-resque-elasticsearch
    ├── gitlab-web-unicorn
    ├── gitlab-web-unicorn-api
    ├── gitlab-web-faye
    └── etc