add parallel view priority and parallel execution of multiple jobs#5908

Closed
dusans wants to merge 2 commits into
ipython:masterfrom
dusans:master

Conversation

@dusans

@dusans dusans commented May 27, 2014

This enables the user to define a priority on a load_balanced_view.

There are 4 levels, modeled on the priority classes of a SQL server.
Note: it doesn't block out lower-priority jobs.

PRIORITY_CRITICAL  Highest priority user jobs
PRIORITY_HIGH      These jobs take precedence over normal jobs
PRIORITY_NORMAL    Default operation level for all jobs
PRIORITY_LOW       Lowest priority user jobs, background
                   loads, jobs that should not affect other activity

Example:

PRIORITY = util.PRIORITY_NORMAL

c = Client(profile='default')
lview = c.load_balanced_view()
lview.set_flags(priority=PRIORITY)

It also executes multiple jobs in parallel. The current implementation executes all the tasks of the first job that was submitted before moving on to the next.
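For illustration only, the ordering described above (tasks drained by priority level, FIFO within a level) could be sketched with a heap-based queue. The constant values and class names here are hypothetical, not the actual code from the patch, and this minimal sketch orders strictly by priority rather than reproducing the "doesn't block out lower priority" behavior:

```python
import heapq
import itertools

# Hypothetical constants mirroring the four levels described in the PR.
PRIORITY_CRITICAL, PRIORITY_HIGH, PRIORITY_NORMAL, PRIORITY_LOW = 1, 2, 3, 4

class PriorityTaskQueue:
    """Sketch: pop tasks in priority order, FIFO within each level."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps submission order

    def push(self, task, priority=PRIORITY_NORMAL):
        heapq.heappush(self._heap, (priority, next(self._counter), task))

    def pop(self):
        return heapq.heappop(self._heap)[-1]

    def __len__(self):
        return len(self._heap)

q = PriorityTaskQueue()
q.push("week_long_simulation", PRIORITY_LOW)
q.push("normal_job")
q.push("quick_test", PRIORITY_HIGH)
print([q.pop() for _ in range(len(q))])
# → ['quick_test', 'normal_job', 'week_long_simulation']
```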

Comment thread IPython/parallel/util.py
Member


this should be a set literal ({...} instead of (...))

Author


https://docs.python.org/2/library/stdtypes.html#set-types-set-frozenset

Being an unordered collection, sets do not record element position or order of insertion. 

I need it to be sorted because of #5908 (diff)
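For illustration (not the actual diff), this is the trade-off being discussed: a tuple preserves the order it was written in, while a set literal guarantees no iteration order at all:

```python
# A tuple keeps the priority levels in the order they are written,
# which matters when the scheduler iterates from highest to lowest.
levels = (1, 2, 3, 4)
assert list(levels) == [1, 2, 3, 4]

# A set is unordered: membership tests are fast, but iteration order
# is unspecified, so code relying on order must re-sort first.
level_set = {3, 1, 4, 2}
assert sorted(level_set) == [1, 2, 3, 4]

print("ordered:", levels, "| set needs sorting:", sorted(level_set))
```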

@minrk
Member

minrk commented May 28, 2014

Thanks for this PR, priority is a much requested feature. A few points on the PR:

  • Please avoid large whitespace changes. Please rebase to remove these changes.
  • Priority should only be associated with individual tasks, not clients.
  • Each session ID should not get its own queue - I shouldn't be able to jump to the head of the line, just by creating a new Client object. There should only be a single queue per priority level.
  • It's probably better to use larger numbers for the priority (e.g. 10, 20 instead of 1, 2), so there's space to fill in.

I'll make a few more comments in-line.

Member


with only one queue per priority, this can be simpler:

sum(len(queue) for queue in self.queues.values())
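A runnable illustration of the suggested one-liner, using a hypothetical `queues` mapping of priority level to pending tasks (the real attribute layout in the scheduler may differ):

```python
from collections import deque

# Hypothetical layout: one deque of pending tasks per priority level.
queues = {
    10: deque(["critical-1"]),
    20: deque(["high-1", "high-2"]),
    30: deque(),
}

pending = sum(len(queue) for queue in queues.values())
print(pending)  # → 3
```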

@dusans
Author

dusans commented May 29, 2014

If we have a separate queue for each job, we can execute them in parallel.
I implemented it in the way that is known to me (data warehousing: a Netezza DB server):
http://pic.dhe.ibm.com/infocenter/ntz/v7r0m3/index.jsp?topic=%2Fcom.ibm.nz.adm.doc%2Fc_sysadm_pqe.html

In my practical experience you have jobs that run for weeks/days/hours. If a user submits a new job that he knows takes only a few minutes, he doesn't want to wait. You can't have a task that runs for days block all the others.

Priorities should only be used in rare cases. The scheduler should take care of executing jobs in parallel in a way that makes most users happy. Having too many priorities doesn't help, I think.
In a real-world environment you can't let the user decide what the priority of his job should be. That is a job for an admin, because if you have 20 users, each user thinks his job is critical.

One feature I was thinking about for another PR: if the scheduler knows that a job will actually take only a few minutes, put that job in front of the jobs that have already been running for hours. This is called short query bias in Netezza (see the link above).

I don't know if this model fits IPython-parallel.
What do you think? :)

@minrk
Member

minrk commented May 29, 2014

If we have a separated queue for each job we can execute them in parallel.

I don't know what you mean by parallel here. The number of queues doesn't affect how many jobs can be run simultaneously, it only affects the order in which tasks are assigned.

And if a user submits a new job that he knows takes only a few minutes he doesn't want to wait.

If I have submitted 100 tasks that take 10 minutes, then an hour later you submit two tasks that take 5 days, your tasks will run before mine – just as fast tasks from different Clients will preempt slow ones, slow ones will preempt fast ones.

In a real world environment u can't have the user decide what should be the priority of his job.

In IPython.parallel, the admin of the scheduler and the submitter of jobs are generally the same user.

The IPython scheduler is not aimed at similar use cases to large scale batch schedulers like PBS. IPython.parallel is not a multi-user environment. Note that all tasks have access to the memory of all other tasks in IPython.parallel. The use case for IPython.parallel in a cluster context is one IPython controller/engines per PBS/SGE job – one user, relatively short-lived. There are not multiple users to attempt to satisfy. Multiple users on one cluster should have multiple IPython schedulers - one per user.

One feature i was thinking about for another PR is that if the scheduler knows that a job will actually take only a few minutes put that job in front of the jobs that are already running for hours. This is called Short query bias in Netezza.

It will not be possible to preempt already running tasks, but it will be able to jump to the head of the line of those that are waiting. I'm not sure if this is appropriate or not for IPython, but I'm guessing that it is not.

@dusans
Author

dusans commented May 29, 2014

I don't know what you mean by parallel here. The number of queues doesn't affect how many jobs can be run simultaneously, it only affects the order in which tasks are assigned.

If you have only one queue for priority normal, and

  • you submit 1000 tasks (20 min per task)
  • and I submit 10 tasks (5 min per task)

my tasks are appended into the queue behind your 1000. That means all my 10 tasks have to wait for your 1000 tasks to be popped out of the queue.
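The scenario can be illustrated with a plain FIFO queue (a sketch, not the scheduler's actual data structure; task names are made up):

```python
from collections import deque

queue = deque()
queue.extend(f"yours-{i}" for i in range(1000))  # 1000 long tasks, queued first
queue.extend(f"mine-{i}" for i in range(10))     # 10 short tasks, queued later

# FIFO: every short task waits behind all 1000 long ones.
position_of_first_short_task = list(queue).index("mine-0")
print(position_of_first_short_task)  # → 1000
print(queue.popleft())               # → yours-0
```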

@minrk
Member

minrk commented May 29, 2014

My tasks are appended into the queue behind your 1000. That means all my 10 tasks have to wait for your 1000 tasks to be popped out of the queue.

Right, it's a single queue. I'm not sure it makes sense for creating a new Client object to be able to jump ahead of prior submissions, just because it's a different Client object. Under normal circumstances, different users will not share a controller, so both Client objects will be me, just at different points in time. I would expect that to just be queued.

@dusans
Author

dusans commented May 29, 2014

Yes, my direction in this PR was wrong. This isn't how ipython-parallel was designed to be used.

I implemented all of this with multiple users in mind, since that is what we need.

@minrk
Member

minrk commented May 29, 2014

Can you describe your use case in more detail?

@dusans
Author

dusans commented May 30, 2014

We have around 20 users (I'm one of the admins) who run simulations on 300 cores. These take days, maybe weeks; quick tests take about an hour.

I'm testing whether we can also use IPython in our environment.

@dusans
Author

dusans commented Jun 2, 2014

I can implement your suggestions to finish up the PR.

But I think executing jobs in parallel (within one priority) is also a benefit for the single-user case. It would make the user's work easier, since otherwise he would always have to set a priority when running jobs in parallel.

@minrk
Member

minrk commented Jun 3, 2014

If you want to finish up this PR, that would be great. If so, I do think multiple queues per priority should be removed.

@minrk
Member

minrk commented Aug 26, 2014

Closing as dormant. Feel free to open a new PR, if you want to finish addressing the review.

@minrk minrk closed this Aug 26, 2014
@minrk minrk modified the milestone: no action Oct 3, 2014
